Abstract

We consider the problem of designing demodulators for linear vector channels with memory that 
use reduced-size trellis descriptions for the received signal. We assume an overall iterative receiver, 
and for the parts of the signal not covered by the trellis description, we use interference cancelation 
based on the soft information provided by the outer decoder. In order to reach a trellis description, a 
linear filter is applied as front-end to compress the signal structure into a small trellis. This process 
requires three parameters to be designed: (i) the front-end filter, (ii) the feedback filter through which 
the interference cancelation is done, and (iii) a target response which specifies the trellis. Demodulators 
of this form have been studied before under the name channel shortening (CS), but the interplay
between CS and interference cancelation has not been adequately addressed in the literature. In this 
paper, we analyze two types of CS demodulators that are based on the Forney and Ungerboeck detection 
models, respectively. The parameters are jointly optimized based on a generalized mutual information 
(GMI) function. We also introduce a third type of CS demodulator that is in general suboptimal but has 
closed form solutions for all parameters. Moreover, signal to noise ratio (SNR) asymptotic properties 
are analyzed and we show that the third CS demodulator asymptotically converges to the optimal CS 
demodulator in the sense of maximizing the GMI. 

This paper will be presented in part at the IEEE 26th annual international symposium on personal, indoor and mobile radio 
communications (PIMRC) in Hong Kong on August 30 - September 2, 2015.
I. Introduction

For intersymbol interference (ISI) channels, Forney [1] showed that the Viterbi Algorithm
(VA) [2] implements maximum likelihood (ML) detection. However, the complexity of the VA
is exponential in the memory of the channel, which prohibits its use in many cases of interest.
As a remedy, Falconer and Magee proposed in 1973 the concept of channel shortening [3].
The concept is to filter the received signal with a channel shortening filter so that the effective 
channel has much shorter duration than the original channel, and then apply the VA to the shorter 
effective channel. 

CS demodulators have a long and rich history, see 0-0. Traditionally, the CS demodulators 
have been optimized from a minimum mean square error (MMSE) perspective Q-[12|. Two 
exceptions from this are the papers [13] and [14]. In [13], the authors attempt to minimize the
error probability of an uncoded system, which leads to a new notion of posterior equivalence
between the target response and the filtered channel. However, since [13] works with uncoded
error probabilities, the analysis in [13] does not adequately address the case of coded systems
and Shannon capacity properties.

To the best of our knowledge, the first paper that works with capacity-related cost measures
is [14]. In [14], the authors consider the achievable rate, in the form of generalized mutual
information (GMI) [15]-[19], that the transceiver system can achieve if a CS demodulator is
adopted. However, [14] is limited to ISI channels only, and the design method in [14] of the CS
demodulator is in fact not always possible to execute. The limitations of [14] were first dealt
with in [18], which extended the CS concept to any linear vector channel and resulted in a closed
form optimization procedure.


In this paper we generalize the idea in [18] to iterative receivers. With iterative receivers it
is reasonable to expect that better performance can be reached by allowing the parameters of
the CS demodulator to change in each iteration. A limitation in [18] is that the CS demodulator
does not take the prior information into account, rendering its design static in all iterations. We
extend the static CS demodulators described in [18] and aim at constructing a CS demodulator
that takes soft information provided by the outer decoder into account, so that the parameters
of the CS demodulator are designed for a particular level of prior knowledge. This procedure
includes an interference cancelation mechanism to deal with the signal part that cannot be
handled by the trellis search. Preliminary results for CS demodulators in iterative receivers are
available in [20], but this paper non-trivially advances the state of the art.

A closely related concept is delayed-decision-feedback-sequence-estimation (DDFSE), first
investigated in [21]. However, in DDFSE the interference cancelation is done within a single
iteration, and not between the iterations of an iterative receiver.

The paper is organized as follows: The linear vector channel model is described in Section II 
while the general form of the CS demodulators and the iterative receiver structure are introduced 
in Section III. In Section IV we analyze three types of CS demodulators for finite length linear 
vector channels and in Section V we deal with ISI channels as asymptotic versions of the results 
established in Section IV. The SNR asymptotics of the CS demodulators are discussed in Section 
VI. Numerical results are provided in Section VII and Section VIII summarizes the paper. For 
improved readability we have deferred some long proofs and derivations to Appendices A-K. 


A. Notation 

Throughout the paper, a capital bold letter such as "$\mathbf{A}$" represents a matrix, a lower case bold
letter "$\mathbf{a}$" represents a vector and the capital letter "$A$" represents a number. "$\mathbf{A} \prec \mathbf{0}$" means
that the matrix $\mathbf{A}$ is negative definite, while "$\mathbf{A} \succ \mathbf{0}$" means that $\mathbf{A}$ is positive definite. The matrix $\mathbf{I}$ represents the
identity matrix and in general the dimension will be omitted; when it cannot be understood from
the context, we let $\mathbf{I}_K$ represent a $K \times K$ identity matrix. Our superscripts have the following
meanings: "$*$" is complex conjugate, "$\mathrm{T}$" is matrix transpose, "$\mathrm{H}$" denotes the conjugate
transpose of a matrix, "$-1$" is matrix inverse, "$-\mathrm{T}$" means both matrix inverse and transpose,
and "$-\mathrm{H}$" denotes both inverse and conjugate transpose of a matrix. In addition, "$\propto$" means
proportional to, "$E[\cdot]$" is the expectation operator, "$\mathrm{Tr}(\cdot)$" takes the trace of a matrix, "$\mathrm{Re}\{\cdot\}$"
returns the real part of a variable, "$\otimes$" is the Kronecker multiplication operator, $\mathrm{vec}(\mathbf{A})$ is a
column vector containing the columns of matrix $\mathbf{A}$ stacked on top of each other, and "$[A, B]$"
is the set of integers $\{k : A \leq k \leq B\}$.

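Furthermore, we say that a matrix $\mathbf{A}$ is banded within diagonals $[-\nu_1, \nu_2]$ ($\nu_1, \nu_2 \geq 0$), if the
$(k,\ell)$th element $A(k,\ell)$ satisfies

$$A(k,\ell) = 0, \quad \ell - k > \nu_1 \ \text{or} \ k - \ell > \nu_2.$$

Moreover, we define two matrix operators $[\,\cdot\,]_{\nu}$ and $[\,\cdot\,]_{\backslash\nu}$ such that $\mathbf{A} = [\mathbf{A}]_{\nu} + [\mathbf{A}]_{\backslash\nu}$, where $[\mathbf{A}]_{\nu}$ is
banded within diagonals $[-\nu, \nu]$, where $[\mathbf{A}]_{\backslash\nu}$ is constrained to zero.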
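The banded-matrix convention and the two operators can be made concrete with a small NumPy sketch. The function names `band_part`, `off_band_part` and `is_banded` are illustrative choices, not notation from the paper; the sketch assumes square matrices and the convention above, in which the first index of the interval counts upper diagonals.

```python
import numpy as np

def band_part(A, nu):
    """[A]_nu: keep the 2*nu+1 center diagonals of A, zero elsewhere."""
    K = A.shape[0]
    k, l = np.indices((K, K))
    return np.where(np.abs(k - l) <= nu, A, 0)

def off_band_part(A, nu):
    """Complement of band_part, so that A = band_part + off_band_part."""
    return A - band_part(A, nu)

def is_banded(A, nu1, nu2, tol=1e-12):
    """True if A(k, l) = 0 whenever l - k > nu1 or k - l > nu2."""
    K = A.shape[0]
    k, l = np.indices((K, K))
    outside = (l - k > nu1) | (k - l > nu2)
    return bool(np.all(np.abs(A[outside]) <= tol))

A = np.arange(1.0, 17.0).reshape(4, 4)
B = band_part(A, 1)          # tridiagonal part of A
C = off_band_part(A, 1)      # everything outside the three center diagonals
```

With this convention a lower triangular matrix with $\nu$ sub-diagonals is banded within diagonals $[0, \nu]$, which `is_banded(np.tril(A), 0, 3)` confirms for the 4 x 4 example.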


II. System Model 


We consider linear vector channels according to

$$\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{n} \tag{1}$$

where $\mathbf{y}$ is an $N \times 1$ vector of received signal, $\mathbf{x}$ is a $K \times 1$ vector comprising unit energy coded
symbols that belong to a constellation $\mathcal{X}$, $\mathbf{H}$ is an $N \times K$ matrix representing the communication
channel and $\mathbf{n}$ is a zero-mean complex Gaussian noise vector with covariance matrix $N_0\mathbf{I}$. Denoting
$x_k$ as the $k$th element of $\mathbf{x}$ and $\mathbf{h}_k$ as the $k$th column vector of $\mathbf{H}$, (1) can be rewritten as

$$\mathbf{y} = \sum_{k=0}^{K-1} \mathbf{h}_k x_k + \mathbf{n}. \tag{2}$$

In an iterative receiver, the feedback from the outer decoder can be utilized in the demodulator
to improve the performance. As the outer decoder provides the demodulator with a posteriori
probability (APP) and extrinsic information (in terms of bit log-likelihood ratio (LLR)) [22],
[23], side information is present about the symbols $\mathbf{x}$ and we represent this by the probability
mass function $p_k(s) = P(x_k = s)$, $0 \leq k \leq K-1$. Note that the side information does not consider
the dependency among the symbols, but consists of symbolwise marginal probabilities. This reflects
the situation encountered in iterative receivers with perfect interleaving. In those cases, the prior
probabilities provided from previous iterations are assumed independent, i.e., $P(\mathbf{x} = \mathbf{s}) = \prod_{k=0}^{K-1} p_k(s_k)$.
Due to the perfect interleaving assumption, the demodulator can compute $\hat{\mathbf{x}} = [\hat{x}_0 \cdots \hat{x}_{K-1}]^{\mathrm{T}} = E(\mathbf{x})$
in a per-entry fashion as

$$\hat{x}_k = \sum_{s \in \mathcal{X}} s \, p_k(s).$$

$^1$Note that $\nu_1$ refers to the number of upper diagonals of $\mathbf{A}$ that are non-zero. We have this convention in order to subsequently
follow standard notation for Toeplitz matrices [42].
Further, the $K \times K$ diagonal matrix $\mathbf{P} = E[\hat{\mathbf{x}}\hat{\mathbf{x}}^{\mathrm{H}}] = E[\mathbf{x}\hat{\mathbf{x}}^{\mathrm{H}}]$ that reflects the quality of the
side information can be calculated, and the expectations are computed with respect to the prior
distribution $p_k(s)$.
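As a small illustration of the per-entry prior mean, the sketch below evaluates the mean of each symbol under its own probability mass function. The QPSK constellation, the two example distributions and the function name `prior_mean` are assumptions made only for this illustration.

```python
import numpy as np

def prior_mean(constellation, pmf):
    """Per-symbol prior mean: x_hat[k] = sum_j s_j * p_k(s_j).

    pmf[k, j] holds p_k(s_j); each row is assumed to sum to one.
    """
    constellation = np.asarray(constellation, dtype=complex)
    pmf = np.asarray(pmf, dtype=float)
    return pmf @ constellation

# Unit-energy QPSK, chosen only as an example constellation.
qpsk = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
pmf = np.array([[0.25, 0.25, 0.25, 0.25],    # no prior knowledge about x_0
                [1.00, 0.00, 0.00, 0.00]])   # perfect prior knowledge about x_1
x_hat = prior_mean(qpsk, pmf)
energy = np.abs(x_hat) ** 2                  # 0 without prior, 1 with a perfect prior
```

The squared magnitude of the prior mean moves from 0 (uniform prior) to 1 (perfect prior), which is the range over which the side information quality varies between iterations.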

The task of the demodulator is to generate soft information about the symbols in $\mathbf{x}$ given the
observable $\mathbf{y}$ and the side information $\{p_k(s)\}$. The optimal demodulator is the maximum-a-posteriori
(MAP) demodulator [24], [25], which evaluates the posterior probabilities $P(x_k = s \mid \mathbf{y})$.
However, the number of leaves of the search tree corresponding to the MAP demodulator is
in general $|\mathcal{X}|^{K}$, which is prohibitive for most practical applications. The purpose of the CS
demodulator is to force the signal model to be a lower triangular matrix with only $\nu + 1$
($0 \leq \nu \leq K-1$) non-zero diagonals by means of a linear filter$^2$; $\nu$ is referred to as the memory
length of the CS demodulator. Then, a BCJR [26] demodulator can be applied over a trellis with
$|\mathcal{X}|^{\nu}$ states. Moreover, since there is side information present about $\mathbf{x}$, the parts of $\mathbf{H}$ acting as
noise can be partly eliminated by means of interference cancelation through the prior mean $\hat{\mathbf{x}}$.

Notice that if we set $\nu = K - 1$, the search space of the CS demodulator is no longer a trellis
but corresponds to the original tree and is therefore equivalent to MAP. On the other hand,
the linear MMSE demodulator with parallel interference cancelation (LMMSE-PIC) [27]-[29]
is a special case of CS demodulation with $\nu = 0$. In the LMMSE-PIC, the BCJR is trivial
since different symbols are assumed to be independent after the front-end filtering. Therefore
the CS demodulator is a generalized framework that includes both the MAP and LMMSE-PIC
demodulators. The CS demodulator can also be viewed as an extension of the LMMSE-PIC to
include a trellis search, where the parameters for the front-end filter, the interference cancelation
and the trellis search process are jointly optimized.


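III. The General Form of the CS Demodulator
We first state two lemmas that will be useful later. Lemma 2 can be verified straightforwardly.

Lemma 1. Let $\mathbf{A}_1$ and $\mathbf{A}_2$ be two $K \times K$ matrices, where $\mathbf{A}_1$ is invertible and banded within diagonals
$[-\nu, \nu]$. If $[\mathbf{A}_1^{-1}]_{\nu} = [\mathbf{A}_2]_{\nu}$, then

$$\mathrm{Tr}(\mathbf{A}_1\mathbf{A}_2) = \mathrm{Tr}(\mathbf{I}).$$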


$^2$For finite length linear vector channels such as multi-input multi-output (MIMO) channels, "filtering" means matrix
multiplication.
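Proof: Let $\mathbf{A}_3 = \mathbf{A}_2 - \mathbf{A}_1^{-1}$, then $[\mathbf{A}_3]_{\nu} = \mathbf{0}$ and $\mathbf{A}_3 = [\mathbf{A}_3]_{\backslash\nu}$. As $\mathbf{A}_1 = [\mathbf{A}_1]_{\nu}$, the elements
along the main diagonal of $\mathbf{A}_1\mathbf{A}_3$ are zero. Therefore $\mathrm{Tr}(\mathbf{A}_1\mathbf{A}_2) = \mathrm{Tr}(\mathbf{A}_1(\mathbf{A}_1^{-1} + \mathbf{A}_3)) = \mathrm{Tr}(\mathbf{I})$.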
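Lemma 1 is easy to check numerically. The sketch below builds a random tridiagonal $\mathbf{A}_1$ and a matrix $\mathbf{A}_2$ that agrees with $\mathbf{A}_1^{-1}$ only inside the band; the sizes, the seed and the diagonal shift (added to keep $\mathbf{A}_1$ well conditioned) are arbitrary choices for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
K, nu = 6, 1
k, l = np.indices((K, K))
in_band = np.abs(k - l) <= nu

# A1: banded within diagonals [-nu, nu]; the shifted diagonal keeps it invertible.
A1 = np.where(in_band, rng.standard_normal((K, K)), 0.0) + 5.0 * np.eye(K)

# A2: equals inv(A1) inside the band and is arbitrary outside of it.
A2 = np.where(in_band, np.linalg.inv(A1), rng.standard_normal((K, K)))

trace_value = np.trace(A1 @ A2)   # Lemma 1 predicts Tr(I) = K
```

The entries of $\mathbf{A}_2$ outside the band never meet a non-zero entry of $\mathbf{A}_1$ on the main diagonal of the product, which is why the trace does not depend on them.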


Lemma 2. Let $\mathbf{A}_1$ and $\mathbf{A}_2$ be two $K \times K$ matrices that are banded within diagonals $[-\nu_1, \nu_2]$
and $[-\nu_3, \nu_4]$, respectively. Then the product $\mathbf{A}_1\mathbf{A}_2$ is banded within diagonals
$[\max(-(\nu_1 + \nu_3), 1 - K), \min(\nu_2 + \nu_4, K - 1)]$.
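A quick numerical check of Lemma 2, again only as a sketch: the helper names `random_banded` and `band_of` and the chosen bandwidths are assumptions for the example, and the band is measured with the convention that the first number counts upper diagonals.

```python
import numpy as np

def random_banded(K, nu_upper, nu_lower, rng):
    """Random K x K matrix banded within diagonals [-nu_upper, nu_lower]."""
    k, l = np.indices((K, K))
    mask = (l - k <= nu_upper) & (k - l <= nu_lower)
    return np.where(mask, rng.standard_normal((K, K)), 0.0)

def band_of(A, tol=1e-12):
    """Smallest (nu_upper, nu_lower) such that A is banded within [-nu_upper, nu_lower]."""
    k, l = np.nonzero(np.abs(A) > tol)
    return int(max((l - k).max(), 0)), int(max((k - l).max(), 0))

rng = np.random.default_rng(1)
K = 8
A1 = random_banded(K, 1, 2, rng)     # banded within [-1, 2]
A2 = random_banded(K, 0, 3, rng)     # banded within [0, 3]
upper, lower = band_of(A1 @ A2)      # Lemma 2 predicts at most (1, 5)
```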


A. System Model of the CS Demodulator 

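The CS demodulators that we investigate operate on the basis of the following mismatched
function

$$\tilde{p}(\mathbf{y}|\mathbf{x},\hat{\mathbf{x}}) = \exp\left(2\mathrm{Re}\{\mathbf{x}^{\mathrm{H}}(\mathbf{V}\mathbf{y} - \mathbf{R}\hat{\mathbf{x}})\} - \mathbf{x}^{\mathrm{H}}\mathbf{G}\mathbf{x}\right) \tag{3}$$

instead of the true conditional probability

$$p(\mathbf{y}|\mathbf{x}) = \frac{1}{(\pi N_0)^{N}} \exp\left(-\frac{\|\mathbf{y} - \mathbf{H}\mathbf{x}\|^{2}}{N_0}\right). \tag{4}$$

Note that $\tilde{p}(\mathbf{y}|\mathbf{x},\hat{\mathbf{x}})$ may not be a valid probability distribution function, but this is irrelevant
for demodulation, see [30]. The matrices $\mathbf{V}$, $\mathbf{R}$ and $\mathbf{G}$ are referred to as the front-end filter,
interference cancelation matrix and trellis representation matrix, respectively. Without loss of
generality, we have absorbed $N_0$ into $\mathbf{V}$, $\mathbf{R}$ and $\mathbf{G}$. Note that (3) and (4) are equivalent for
demodulation if we set $\mathbf{V} = \mathbf{H}^{\mathrm{H}}/N_0$, $\mathbf{R} = \mathbf{0}$ and $\mathbf{G} = \mathbf{H}^{\mathrm{H}}\mathbf{H}/N_0$, in which case the CS
demodulator represents the MAP demodulator without interference cancelation.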

In this paper we will go through three types of CS demodulators that can be expressed in the
form (3), but with different constructions of the matrices $\mathbf{V}$, $\mathbf{R}$ and $\mathbf{G}$ that represent different views
on the domain in which the CS should be performed.

In order to optimize the matrices $(\mathbf{V}, \mathbf{R}, \mathbf{G})$, we choose to work with the GMI, which is an
achievable rate for a receiver that operates on the basis of a mismatched version of the channel
law. The GMI in nats/channel equals

$$I_{\mathrm{GMI}} = -E[\log \tilde{p}(\mathbf{y}|\hat{\mathbf{x}})] + E[\log \tilde{p}(\mathbf{y}|\mathbf{x},\hat{\mathbf{x}})] \tag{5}$$

where $\tilde{p}(\mathbf{y}|\hat{\mathbf{x}}) = \frac{1}{\pi^{K}} \int \tilde{p}(\mathbf{y}|\mathbf{x},\hat{\mathbf{x}}) \exp(-\|\mathbf{x}\|^{2}) \, d\mathbf{x}$ and the expectation is taken over the true
statistics of the channel. Notice that while finite constellations $\mathcal{X}$ are almost always used in
practice, they are hard to analyze. In order to obtain a mathematically tractable problem, here
we use a zero-mean, unit variance, complex Gaussian constellation for each entry of $\mathbf{x}$. With
complex Gaussian inputs, the trellis discussed earlier has no proper meaning as the number of
states is infinite even for finite $\nu$. However, the complex Gaussian assumption is only made in
order to design the receiver parameters $(\mathbf{V}, \mathbf{R}, \mathbf{G})$.

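Theorem 1. With the system model in (3), the GMI defined in (5) reads

$$I_{\mathrm{GMI}}(\mathbf{V},\mathbf{R},\mathbf{G}) = \log(\det(\mathbf{I}+\mathbf{G})) - \mathrm{Tr}(\mathbf{G}) + 2\mathrm{Re}\{\mathrm{Tr}(\mathbf{V}\mathbf{H} - \mathbf{R}\mathbf{P})\} - \mathrm{Tr}\left((\mathbf{I}+\mathbf{G})^{-1}\left(\mathbf{V}(N_0\mathbf{I} + \mathbf{H}\mathbf{H}^{\mathrm{H}})\mathbf{V}^{\mathrm{H}} - 2\mathrm{Re}\{\mathbf{V}\mathbf{H}\mathbf{P}\mathbf{R}^{\mathrm{H}}\} + \mathbf{R}\mathbf{P}\mathbf{R}^{\mathrm{H}}\right)\right). \tag{6}$$

Here we make the same assumption as in [18] that $\mathbf{I} + \mathbf{G}$ is positive definite, otherwise the GMI
is not well defined. The proof of Theorem 1 is given in Appendix A.

With any parameters $(\mathbf{V}, \mathbf{R}, \mathbf{G})$, the GMI can be calculated from (6), although they may not
be optimal in the sense of maximizing the GMI. We illustrate Theorem 1 with two examples.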
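The expression in (6) is straightforward to evaluate numerically. The sketch below is one possible implementation; the function name `gmi` is an illustrative choice, and the matrix-valued term $2\mathrm{Re}\{\mathbf{V}\mathbf{H}\mathbf{P}\mathbf{R}^{\mathrm{H}}\}$ is read as $\mathbf{X} + \mathbf{X}^{\mathrm{H}}$ with $\mathbf{X} = \mathbf{V}\mathbf{H}\mathbf{P}\mathbf{R}^{\mathrm{H}}$, which is an assumption about the notation. As a sanity check it uses the MAP parameters $\mathbf{V} = \mathbf{H}^{\mathrm{H}}/N_0$, $\mathbf{R} = \mathbf{0}$, $\mathbf{G} = \mathbf{H}^{\mathrm{H}}\mathbf{H}/N_0$, for which (6) collapses to $\log\det(\mathbf{I} + \mathbf{H}^{\mathrm{H}}\mathbf{H}/N_0)$.

```python
import numpy as np

def gmi(V, R, G, H, P, N0):
    """Evaluate the GMI expression (6) in nats for given (V, R, G)."""
    K = G.shape[0]
    N = H.shape[0]
    IG = np.eye(K) + G                      # assumed positive definite
    logdet = np.linalg.slogdet(IG)[1]
    cross = V @ H @ P @ R.conj().T
    inner = (V @ (N0 * np.eye(N) + H @ H.conj().T) @ V.conj().T
             - (cross + cross.conj().T)     # 2Re{.} read as X + X^H (assumption)
             + R @ P @ R.conj().T)
    value = (logdet - np.trace(G) + 2 * np.trace(V @ H - R @ P).real
             - np.trace(np.linalg.solve(IG, inner)))
    return float(np.real(value))

rng = np.random.default_rng(0)
N, K, N0 = 5, 4, 0.5
H = (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))) / np.sqrt(2)
S = H.conj().T @ H
zero = np.zeros((K, K))

map_rate = gmi(H.conj().T / N0, zero, S / N0, H, zero, N0)
capacity = np.linalg.slogdet(np.eye(K) + S / N0)[1]   # log det(I + H^H H / N0)
```

The two numbers `map_rate` and `capacity` coincide up to rounding, in line with the remark that the MAP parameters make (3) and (4) equivalent for demodulation.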

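Example 1. Extended Zero-Forcing filter (EZF). We extend the zero-forcing filter [31] to only
partly invert the channel so that a trellis search is necessary after the EZF front-end filter. In
view of the CS demodulator, we can select the parameters in (3) as

$$\mathbf{V} = (\mathbf{I}+\mathbf{G})(\mathbf{H}^{\mathrm{H}}\mathbf{H})^{-1}\mathbf{H}^{\mathrm{H}}, \quad \mathbf{R} = \mathbf{0},$$

and then optimize (6) over $\mathbf{G}$. In order to satisfy the constraint of having a trellis with $|\mathcal{X}|^{\nu}$
states, we should have $\mathbf{G} = [\mathbf{G}]_{\nu}$. The optimal $\mathbf{G}$, in the sense of maximizing (6), will be shown
(Theorem 2) to satisfy

$$[(\mathbf{I}+\mathbf{G})^{-1}]_{\nu} = N_0[(\mathbf{H}^{\mathrm{H}}\mathbf{H})^{-1}]_{\nu}.$$

Utilizing Lemma 1, the GMI in (6) for the optimal $\mathbf{G}$ equals

$$I_{\mathrm{GMI}} = \log(\det(\mathbf{I}+\mathbf{G})) + \mathrm{Tr}\left(\mathbf{I} - N_0(\mathbf{H}^{\mathrm{H}}\mathbf{H})^{-1}(\mathbf{I}+\mathbf{G})\right) = \log(\det(\mathbf{I}+\mathbf{G})).$$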


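Example 2. Truncated Matched filter (TMF). As previously mentioned, the MAP demodulator
(4) can be written in the form (3) by setting $\mathbf{V} = \mathbf{H}^{\mathrm{H}}/N_0$, $\mathbf{R} = \mathbf{0}$ and $\mathbf{G} = \mathbf{H}^{\mathrm{H}}\mathbf{H}/N_0$. The
front-end is in this case a matched filter [32] and the BCJR needs to be implemented over the
Ungerboeck model [33]. In order to reach a trellis with $|\mathcal{X}|^{\nu}$ states, we can truncate $\mathbf{G}$ to its
center $2\nu+1$ diagonals, i.e., we can use the following parameters in (3):

$$\mathbf{V} = \mathbf{H}^{\mathrm{H}}/N_0, \quad \mathbf{R} = \mathbf{0}, \quad \text{and} \quad \mathbf{G} = [\mathbf{H}^{\mathrm{H}}\mathbf{H}/N_0]_{\nu}.$$

With these choices, the GMI in (6) equals

$$I_{\mathrm{GMI}} = \log(\det(\mathbf{I} + [\mathbf{H}^{\mathrm{H}}\mathbf{H}/N_0]_{\nu})) - \frac{1}{N_0}\mathrm{Tr}\left(\mathbf{H}^{\mathrm{H}}\mathbf{H}\,(N_0\mathbf{I} + [\mathbf{H}^{\mathrm{H}}\mathbf{H}]_{\nu})^{-1}[\mathbf{H}^{\mathrm{H}}\mathbf{H}]_{\backslash\nu}\right).$$

Fig. 1: CS demodulator that maximizes the GMI.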

B. Constraints on the Parameter R for the CS Demodulator 

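Our approach to designing a CS demodulator consists of two steps. As illustrated in Figure 1,
these are:

• Construction of a signal $\hat{\mathbf{y}}$ based on the received signal $\mathbf{y}$ and prior mean $\hat{\mathbf{x}}$ as $\hat{\mathbf{y}} = \mathbf{V}\mathbf{y} - \mathbf{R}\hat{\mathbf{x}}$.

• BCJR demodulation of $\hat{\mathbf{y}}$ operating on a reduced number of states $|\mathcal{X}|^{\nu}$.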

This procedure is fully analogous to an LMMSE-PIC demodulator which first subtracts the 
interference, applies a Wiener filter and concludes by a BCJR demodulator that operates with a 
diagonal matrix G. 

As mentioned earlier, optimization of the demodulator will be made on the basis of the GMI, which
is evaluated for the statistical model of the tuple $(\mathbf{x}, \hat{\mathbf{y}})$. The statistical behavior of $(\mathbf{x}, \hat{\mathbf{y}})$ may be
superior to that of the original $(\mathbf{x}, \mathbf{y})$, as the former tuple corresponds to a statistically different
channel than the true one. Hence, the GMI may very well exceed the channel capacity. Moreover,
the computed value of the GMI may have little relevance for the performance of the transceiver
system. In order for the GMI to have bearing on performance, it is critical to put constraints on the
matrix $\mathbf{R}$, as our next example will show.

Example 3. Let the system model be

$$\mathbf{y} = \mathbf{x} + \mathbf{n}$$

with arbitrary noise density $N_0$, where $\mathbf{y}$, $\mathbf{x}$ and $\mathbf{n}$ are $K \times 1$ vectors. Assume perfect feedback
information, i.e., $\hat{\mathbf{x}} = \mathbf{x}$. The demodulator parameters are taken as $\mathbf{V} = \mathbf{0}$, $\mathbf{R} = -(1+\beta)\mathbf{I}$, and
$\mathbf{G} = \beta\mathbf{I}$, with $\beta$ an arbitrary positive real value. Then the statistical model for $\hat{\mathbf{y}}$ is

$$\hat{\mathbf{y}} = \mathbf{V}\mathbf{y} - \mathbf{R}\hat{\mathbf{x}} = (1+\beta)\mathbf{x}.$$

The GMI in (6) for the pair $(\hat{\mathbf{y}}, \mathbf{x})$ is

$$I_{\mathrm{GMI}}(\mathbf{V},\mathbf{R},\mathbf{G}) = K(1 + \log(1+\beta)).$$

In order to maximize the GMI, the demodulator will choose $\beta \to \infty$ to make $I_{\mathrm{GMI}}(\mathbf{V},\mathbf{R},\mathbf{G})$
infinite. This is because, rather than using the feedback information only for interference cancelation,
the demodulator uses the prior mean $\hat{\mathbf{x}}$ as signal energy via $\mathbf{R}$. A demodulator equipped with
these parameters will have significant error propagation and does not have much operational
meaning for an iterative receiver. Thus, we can conclude that unless constraints are put on $\mathbf{R}$,
the GMI value is not relevant.
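The value of the GMI in Example 3 can be traced term by term through (6). The short derivation below assumes $\mathbf{H} = \mathbf{I}$ (from the system model of the example) and $\mathbf{P} = \mathbf{I}$, the latter being our reading of perfect feedback for unit energy symbols.

```latex
\begin{align*}
\log\det(\mathbf{I}+\mathbf{G}) &= \log\det\big((1+\beta)\mathbf{I}\big) = K\log(1+\beta), \\
-\mathrm{Tr}(\mathbf{G}) &= -K\beta, \\
2\mathrm{Re}\{\mathrm{Tr}(\mathbf{V}\mathbf{H}-\mathbf{R}\mathbf{P})\} &= 2\,\mathrm{Tr}\big((1+\beta)\mathbf{I}\big) = 2K(1+\beta), \\
-\mathrm{Tr}\big((\mathbf{I}+\mathbf{G})^{-1}\mathbf{R}\mathbf{P}\mathbf{R}^{\mathrm{H}}\big) &= -\frac{K(1+\beta)^{2}}{1+\beta} = -K(1+\beta), \\
I_{\mathrm{GMI}} &= K\log(1+\beta) - K\beta + 2K(1+\beta) - K(1+\beta) = K\big(1+\log(1+\beta)\big).
\end{align*}
```

All terms in (6) that contain $\mathbf{V}$ vanish because $\mathbf{V} = \mathbf{0}$, so the entire rate stems from $\mathbf{R}$ acting on the prior mean, and it grows without bound in $\beta$.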

In this paper we shall investigate three different constraints (to be made precise later) for
$\mathbf{R}$. All three have in common that rather than adding signal energy, the rationale of $\mathbf{R}$ should
be to remove interference. Therefore, at the very minimum, the diagonal elements of $\mathbf{R}$ should
be constrained to zero, so that the demodulation of each symbol in $\mathbf{x}$ does not rely on its
own prior mean in $\hat{\mathbf{x}}$. Such a constraint is perfectly aligned with the operations of the LMMSE-PIC
demodulator, where $\hat{x}_k$ is not used for demodulation of $x_k$. Furthermore, the rationale of
the constraints we impose on $\mathbf{R}$ is to follow the principle of extrinsic information: the BCJR
module should not rely on the prior information when demodulating $x_k$ (this requires more
than just the diagonal of $\mathbf{R}$ to be zero).

We point out that the fact that the GMI can exceed the channel capacity is a consequence of
our choice to not include the side information as a prior distribution on $\mathbf{x}$ when evaluating the
GMI. If we did, the GMI would decay with increasing quality of the side information.

Finally, we acknowledge the fact that a permutation of the columns of $\mathbf{H}$ can boost the
performance of the CS demodulator whenever $0 < \nu < K - 1$ for finite length linear vector
channels, and this will be briefly illustrated in the numerical results section. However, minimum-phase
conversions of ISI channels are not beneficial, as we will anyway solve for the optimal
front-end filter $\mathbf{V}$.
Fig. 2: CS demodulator and decoding receiver structure.


C. Receiver Structure with the CS Demodulator 

The overall structure of a receiver that utilizes a CS demodulator is shown in Figure 2. The
APP output from the outer decoder is used to compute the estimate $\hat{\mathbf{x}}$ and the matrix $\mathbf{P}$. Based
on the updated $\mathbf{P}$ in each global iteration, the optimal CS parameters are found by maximizing
the GMI in (6). An interference cancelation process is then implemented with the optimal $\mathbf{V}$
and $\mathbf{R}$ to obtain the signal $\hat{\mathbf{y}}$, which is sent to a memory $\nu$ BCJR module with the optimal $\mathbf{G}$.
Moreover, the extrinsic information iteratively exchanged between the BCJR demodulator and
the outer turbo decoder is also used as the prior information for the transmitted symbols.

IV. Parameter Optimization for Finite Length Linear Vector Channel 
A. Method I 

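Method I has its roots in Falconer and Magee's paper [3], but adds an interference cancelation
step. The system model of the demodulator is

$$\tilde{p}(\mathbf{y}|\mathbf{x},\hat{\mathbf{x}}) \propto \exp\left(-\|\mathbf{W}\mathbf{y} - \mathbf{T}\hat{\mathbf{x}} - \mathbf{F}\mathbf{x}\|^{2}\right) \tag{7}$$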

where the following structures of the involved matrices are imposed: 

• $\mathbf{W}$ is a $K \times N$ matrix with no constraints.

• $\mathbf{F}$ is a $K \times K$ lower triangular matrix where only the main diagonal and the first $\nu$ lower
diagonals are non-zero, i.e., $\mathbf{F}$ is banded within diagonals $[0, \nu]$ ($0 \leq \nu \leq K - 1$). $\nu$ is
denoted as the memory length of $\mathbf{F}$. Moreover, the main diagonal of $\mathbf{F}$ is constrained to
only contain positive real values.

• $\mathbf{T}$ is a $K \times K$ matrix that is constrained to be zero wherever $\mathbf{F}$ can take non-zero values.
We point out that by setting T = 0, we obtain the same system model as in Q. The constraint 
of F is to shorten the memory for the trellis search in the BCJR module, while the purpose of 
the constraint on T is to cancel the signal part that F can not handle. 

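Note that if we identify $\mathbf{V} = \mathbf{F}^{\mathrm{H}}\mathbf{W}$, $\mathbf{R} = \mathbf{F}^{\mathrm{H}}\mathbf{T}$ and $\mathbf{G} = \mathbf{F}^{\mathrm{H}}\mathbf{F}$, (7) can be rewritten in the
general form (3),

$$\tilde{p}(\mathbf{y}|\mathbf{x},\hat{\mathbf{x}}) \propto \exp\left(2\mathrm{Re}\{\mathbf{x}^{\mathrm{H}}(\mathbf{F}^{\mathrm{H}}\mathbf{W}\mathbf{y} - \mathbf{F}^{\mathrm{H}}\mathbf{T}\hat{\mathbf{x}})\} - \mathbf{x}^{\mathrm{H}}\mathbf{F}^{\mathrm{H}}\mathbf{F}\mathbf{x}\right) = \exp\left(2\mathrm{Re}\{\mathbf{x}^{\mathrm{H}}(\mathbf{V}\mathbf{y} - \mathbf{R}\hat{\mathbf{x}})\} - \mathbf{x}^{\mathrm{H}}\mathbf{G}\mathbf{x}\right),$$

and the GMI in (6) in this case reads

$$I_{\mathrm{GMI}}(\mathbf{W},\mathbf{T},\mathbf{F}) = \log(\det(\mathbf{I}+\mathbf{F}^{\mathrm{H}}\mathbf{F})) - \mathrm{Tr}(\mathbf{F}^{\mathrm{H}}\mathbf{F}) + 2\mathrm{Re}\{\mathrm{Tr}(\mathbf{F}^{\mathrm{H}}(\mathbf{W}\mathbf{H} - \mathbf{T}\mathbf{P}))\} - \mathrm{Tr}\left((\mathbf{I}+\mathbf{F}^{\mathrm{H}}\mathbf{F})^{-1}\mathbf{L}_1\right) \tag{8}$$

where

$$\mathbf{L}_1 = \mathbf{F}^{\mathrm{H}}\mathbf{W}(N_0\mathbf{I} + \mathbf{H}\mathbf{H}^{\mathrm{H}})\mathbf{W}^{\mathrm{H}}\mathbf{F} - 2\mathrm{Re}\{\mathbf{F}^{\mathrm{H}}\mathbf{W}\mathbf{H}\mathbf{P}\mathbf{T}^{\mathrm{H}}\mathbf{F}\} + \mathbf{F}^{\mathrm{H}}\mathbf{T}\mathbf{P}\mathbf{T}^{\mathrm{H}}\mathbf{F}.$$

It can be verified that with the aforementioned constraints on $\mathbf{F}$ and $\mathbf{T}$, the elements of
$\mathbf{R} = \mathbf{F}^{\mathrm{H}}\mathbf{T}$ have the special form of type (a) that is depicted in Figure 3. That is, all diagonal
elements are zero, as well as the lower triangular part of the $(\nu+1) \times (\nu+1)$ right bottom corner.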

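In order to optimize (8) over the parameters $(\mathbf{W}, \mathbf{T}, \mathbf{F})$, we first introduce an $S \times K^{2}$ indication
matrix $\boldsymbol{\Omega}$ only consisting of ones and zeros, having a single 1 in each row. $S$ equals the number
of elements in $\mathbf{T}$ that are allowed to be non-zero. Let $\mathcal{I}(\mathrm{vec}(\mathbf{T}))$ be a vector that contains the
positions where the vector $\mathrm{vec}(\mathbf{T})$ is allowed to be non-zero. Then the value of the $k$th entry in
$\mathcal{I}(\mathrm{vec}(\mathbf{T}))$ gives the column where row $k$ of $\boldsymbol{\Omega}$ is 1. That is, the $S \times 1$ vector $\boldsymbol{\Omega}\mathrm{vec}(\mathbf{T})$ stacks the
columns of $\mathbf{T}$ on top of each other, but with all elements that are constrained to zero removed.
With such a definition of $\boldsymbol{\Omega}$, and defining two $K \times K$ matrices as

$$\mathbf{M} = \mathbf{H}^{\mathrm{H}}(N_0\mathbf{I} + \mathbf{H}\mathbf{H}^{\mathrm{H}})^{-1}\mathbf{H} - \mathbf{I}, \tag{9}$$

$$\hat{\mathbf{M}} = \mathbf{P}(\mathbf{I}+\mathbf{M})\mathbf{P} - \mathbf{P}, \tag{10}$$

the GMI for the optimal $\mathbf{W}$ and $\mathbf{T}$ is given in Proposition 1 and the proof is in Appendix B.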
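The indication matrix can be built directly from a boolean mask of the entries that are allowed to be non-zero. The following is a minimal sketch; the function name `selection_matrix`, the dimensions and the test matrix are assumptions for the example, and the mask follows the Method I constraint that $\mathbf{T}$ vanishes wherever $\mathbf{F}$ may be non-zero.

```python
import numpy as np

def selection_matrix(free_mask):
    """Indication matrix (S x K^2) with a single 1 per row, one row for each
    entry of vec(T) that is allowed to be non-zero (column-wise stacking)."""
    positions = np.flatnonzero(free_mask.flatten(order="F"))
    omega = np.zeros((positions.size, free_mask.size))
    omega[np.arange(positions.size), positions] = 1.0
    return omega

K, nu = 4, 1
k, l = np.indices((K, K))
f_mask = (k - l >= 0) & (k - l <= nu)      # F may be non-zero within diagonals [0, nu]
t_mask = ~f_mask                           # T is zero wherever F may be non-zero
omega = selection_matrix(t_mask)

T = np.where(t_mask, np.arange(1.0, K * K + 1).reshape(K, K), 0.0)
vec_T = T.flatten(order="F")
stacked = omega @ vec_T                    # free entries of T, in column order
```

For a matrix that obeys the constraint, multiplying the stacked vector by the transpose of the indication matrix restores `vec_T`, and the indication matrix times its transpose is the identity; both properties are used in the proof of Proposition 2.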
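Proposition 1. Define an $S \times K^{2}$ matrix $\mathbf{D} = \boldsymbol{\Omega}((\mathbf{M}\mathbf{P})^{\mathrm{T}} \otimes \mathbf{I})$. The optimal $\mathbf{W}$ for the GMI in
(8) is

$$\mathbf{W}_{\mathrm{opt}} = \mathbf{F}^{-\mathrm{H}}(\mathbf{I} + \mathbf{F}^{\mathrm{H}}\mathbf{F} + \mathbf{F}^{\mathrm{H}}\mathbf{T}\mathbf{P})\mathbf{H}^{\mathrm{H}}(N_0\mathbf{I} + \mathbf{H}\mathbf{H}^{\mathrm{H}})^{-1}, \tag{11}$$

and when $\mathbf{P} \neq \mathbf{0}$, the optimal $\mathbf{T}$ for the GMI in (8) is given by

$$\mathrm{vec}(\mathbf{T}_{\mathrm{opt}}) = -\boldsymbol{\Omega}^{\mathrm{T}}\left(\boldsymbol{\Omega}\left(\hat{\mathbf{M}}^{*} \otimes (\mathbf{F}(\mathbf{I}+\mathbf{F}^{\mathrm{H}}\mathbf{F})^{-1}\mathbf{F}^{\mathrm{H}})\right)\boldsymbol{\Omega}^{\mathrm{T}}\right)^{-1}\mathbf{D}\,\mathrm{vec}(\mathbf{F}). \tag{12}$$

With the optimal $\mathbf{W}$ and $\mathbf{T}$, the GMI reads

$$I_{\mathrm{GMI}}(\mathbf{W}_{\mathrm{opt}},\mathbf{T}_{\mathrm{opt}},\mathbf{F}) = \begin{cases} I_1(\mathbf{F}), & \mathbf{P} = \mathbf{0} \\ I_1(\mathbf{F}) + \delta_1(\mathbf{F}), & \mathbf{P} \neq \mathbf{0}. \end{cases} \tag{13}$$

The functions $I_1(\mathbf{F})$ and $\delta_1(\mathbf{F})$$^3$ are defined as

$$I_1(\mathbf{F}) = K + \log(\det(\mathbf{I}+\mathbf{F}^{\mathrm{H}}\mathbf{F})) + \mathrm{Tr}(\mathbf{M}(\mathbf{I}+\mathbf{F}^{\mathrm{H}}\mathbf{F})), \tag{14}$$

$$\delta_1(\mathbf{F}) = -\mathrm{vec}(\mathbf{F})^{\mathrm{H}}\mathbf{D}^{\mathrm{H}}\left(\boldsymbol{\Omega}\left(\hat{\mathbf{M}}^{*} \otimes (\mathbf{F}(\mathbf{I}+\mathbf{F}^{\mathrm{H}}\mathbf{F})^{-1}\mathbf{F}^{\mathrm{H}})\right)\boldsymbol{\Omega}^{\mathrm{T}}\right)^{-1}\mathbf{D}\,\mathrm{vec}(\mathbf{F}). \tag{15}$$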

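Remark 1. With the definitions in (9) and (10), $\mathbf{M}$ is the negative of the MSE matrix. By the
matrix inversion lemma [34], $\mathbf{M} = -(\mathbf{I} + \mathbf{H}^{\mathrm{H}}\mathbf{H}/N_0)^{-1} \prec \mathbf{0}$ and $\hat{\mathbf{M}} = \mathbf{P}\mathbf{M}\mathbf{P} + \mathbf{P}^{2} - \mathbf{P} \preceq \mathbf{0}$.
Hence $\delta_1(\mathbf{F}) \geq 0$ and it represents the GMI increment from the soft information feedback.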
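The two identities in Remark 1 can be checked numerically. In the sketch below the channel is random and the diagonal of $\mathbf{P}$ is drawn from $[0, 1]$; that range, like the dimensions and the seed, is an assumption made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, N0 = 6, 4, 0.3
H = (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))) / np.sqrt(2)
P = np.diag(rng.uniform(0.0, 1.0, K))     # assumed side-information quality in [0, 1]

Hh = H.conj().T
M = Hh @ np.linalg.solve(N0 * np.eye(N) + H @ Hh, H) - np.eye(K)   # definition (9)
M_alt = -np.linalg.inv(np.eye(K) + Hh @ H / N0)                    # form in Remark 1
M_hat = P @ (np.eye(K) + M) @ P - P                                # definition (10)

eig_M = np.linalg.eigvalsh(M)             # all negative: M is negative definite
eig_M_hat = np.linalg.eigvalsh(M_hat)     # all non-positive
```

Both `M` and `M_alt` agree up to rounding, and the eigenvalues confirm the definiteness claims that make the increment from the soft information non-negative.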


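Before discussing the optimization of (13), we state Theorem 2 that deals with a general GMI
maximization problem.

Theorem 2. Define a scalar function $I$ with respect to a $K \times K$ matrix $\mathbf{G}$ as

$$I(\mathbf{G}) = K + \log(\det(\mathbf{I}+\mathbf{G})) + \mathrm{Tr}(\mathbf{M}(\mathbf{I}+\mathbf{G})), \tag{16}$$

where $\mathbf{G}$ satisfies $\mathbf{G} = [\mathbf{G}]_{\nu}$. Then the optimal $\mathbf{G}$ that maximizes $I$ is the unique solution that
satisfies

$$[(\mathbf{I}+\mathbf{G}_{\mathrm{opt}})^{-1}]_{\nu} = -[\mathbf{M}]_{\nu}. \tag{17}$$

With $\mathbf{G}_{\mathrm{opt}}$ the optimal $I$ reads

$$I(\mathbf{G}_{\mathrm{opt}}) = \log(\det(\mathbf{I}+\mathbf{G}_{\mathrm{opt}})). \tag{18}$$

$^3$$\delta_1(\mathbf{F})$ in (15) is only defined for $\mathbf{P} \neq \mathbf{0}$, as when $\mathbf{P} = \mathbf{0}$, $\hat{\mathbf{M}} = \mathbf{0}$ and the inversion in $\delta_1(\mathbf{F})$ is not well defined. The same
comment holds for $\delta_2(\mathbf{G})$ in (25).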
Proof: Taking the first order differential of $I$ with respect to $\mathbf{G}$ and noticing that $\mathbf{G}$ is banded
within diagonals $[-\nu, \nu]$ yields (17) after some manipulations. The existence and uniqueness
of such an optimal solution for (17) is proved in [35, Theorem 2] and also illustrated in [18,
Proposition 2]. By Lemma 1, $\mathrm{Tr}(\mathbf{M}(\mathbf{I} + \mathbf{G}_{\mathrm{opt}})) = -K$ from (17), and then (18) follows. ■
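For the memoryless case $\nu = 0$, condition (17) only involves the main diagonal and can be solved by inspection, which gives a compact way to see Theorem 2 at work. The sketch below covers only this special case (the general banded solution follows the cited references and is not reproduced here); the function names and the random channel are assumptions for the example.

```python
import numpy as np

def theorem2_objective(G, M):
    """I(G) = K + log det(I + G) + Tr(M (I + G)), cf. (16)."""
    K = G.shape[0]
    IG = np.eye(K) + G
    return float(np.real(K + np.linalg.slogdet(IG)[1] + np.trace(M @ IG)))

def g_opt_memoryless(M):
    """nu = 0: (17) reduces to 1 / (1 + g_k) = -M[k, k] on the main diagonal."""
    return np.diag(-1.0 / np.real(np.diag(M)) - 1.0)

rng = np.random.default_rng(0)
N, K, N0 = 6, 4, 0.5
H = (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))) / np.sqrt(2)
M = -np.linalg.inv(np.eye(K) + H.conj().T @ H / N0)   # negative definite, cf. Remark 1

G_opt = g_opt_memoryless(M)
rate = theorem2_objective(G_opt, M)                    # equals log det(I + G_opt), cf. (18)
```

Perturbing the diagonal of `G_opt` in either direction lowers the objective, as expected for the maximizer of a function that is concave in $\mathbf{G}$.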
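Optimizing over $\mathbf{F}$ in (13) when $\mathbf{P} \neq \mathbf{0}$ is difficult and cannot be carried out in closed form.
In Appendix C we show by an example that (13) is in general non-concave. Nevertheless, a
gradient based numerical optimization procedure is utilized to search for the optimal $\mathbf{F}$. In the
$i$th iteration, the update of $\mathbf{F}$ is constructed from $\nabla_{\mathbf{F}^{*}} I_{\mathrm{GMI}}(\mathbf{W}_{\mathrm{opt}}, \mathbf{T}_{\mathrm{opt}}, \mathbf{F})$, the conjugate of the gradient
$\nabla_{\mathbf{F}} I_{\mathrm{GMI}}(\mathbf{W}_{\mathrm{opt}}, \mathbf{T}_{\mathrm{opt}}, \mathbf{F})$ with respect to (the non-zero part of) $\mathbf{F}$, which is given in Appendix D.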

When $\mathbf{P} = \mathbf{0}$, if we replace $\mathbf{F}^{\mathrm{H}}\mathbf{F}$ by $\mathbf{G}$, (14) has the same form as (16) in Theorem 2 and
$\mathbf{G}_{\mathrm{opt}}$ is in closed form. However, it can be verified that (14) is non-concave with respect to $\mathbf{F}$.
Hence, if the optimal $\mathbf{G}$ is positive definite, the optimal $\mathbf{F}$ equals the Cholesky decomposition
of the optimal $\mathbf{G}$. Whenever it is not, a gradient based numerical optimization procedure is
utilized to optimize (14). After applying a regularization to force the optimal $\mathbf{G}$ to be positive
definite, the Cholesky decomposition of the optimal $\mathbf{G}$ is chosen to be the starting point of $\mathbf{F}$
in the optimization procedure, both for the case $\mathbf{P} \neq \mathbf{0}$ and $\mathbf{P} = \mathbf{0}$. The optimization procedure
has been observed to be highly reliable with such initialization.
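The Cholesky starting point requires a lower triangular $\mathbf{F}$ with positive diagonal such that $\mathbf{F}^{\mathrm{H}}\mathbf{F} = \mathbf{G}$, whereas `numpy.linalg.cholesky` returns a lower triangular $\mathbf{L}$ with $\mathbf{G} = \mathbf{L}\mathbf{L}^{\mathrm{H}}$. One way to bridge the two, sketched below, is to factor the row- and column-reversed matrix; the function name `forney_factor` and the tridiagonal test matrix are assumptions for the illustration, and the regularization step mentioned above is not included.

```python
import numpy as np

def forney_factor(G):
    """Lower triangular F with positive diagonal such that F^H F = G.

    G is assumed Hermitian and positive definite.
    """
    J = np.eye(G.shape[0])[::-1]          # exchange matrix, J @ J = I
    L = np.linalg.cholesky(J @ G @ J)     # J G J = L L^H with L lower triangular
    U = J @ L @ J                         # upper triangular, G = U U^H
    return U.conj().T                     # F = U^H is lower triangular

K = 5
G = 2.0 * np.eye(K) - 0.5 * (np.eye(K, k=1) + np.eye(K, k=-1))   # banded, nu = 1
F = forney_factor(G)
```

Because the Cholesky factor of a banded matrix is banded, `F` inherits the memory length of `G`: here it has only the main diagonal and one lower diagonal, as required of the Forney model matrix.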

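Next we state a fact that establishes the connection between the optimal front-end filter $\mathbf{W}$
and the optimal interference cancelation matrix $\mathbf{T}$ in Proposition 2.

Proposition 2. For $\mathbf{P} \neq \mathbf{0}$ and the optimal $\mathbf{W}$ and $\mathbf{T}$, the matrix $\mathbf{F}^{\mathrm{H}}(\mathbf{W}_{\mathrm{opt}}\mathbf{H} - \mathbf{T}_{\mathrm{opt}})$ is banded
within diagonals $[-\nu, K-1]$ for any $\mathbf{F}$ that is banded within diagonals $[0, \nu]$.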


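Proof: By the definition of $\boldsymbol{\Omega}$, we have $\boldsymbol{\Omega}^{\mathrm{T}}\boldsymbol{\Omega}\mathrm{vec}(\mathbf{T}_{\mathrm{opt}}) = \mathrm{vec}(\mathbf{T}_{\mathrm{opt}})$ and $\boldsymbol{\Omega}\boldsymbol{\Omega}^{\mathrm{T}} = \mathbf{I}$. Therefore
(12) can be rewritten as

$$\boldsymbol{\Omega}\left(\hat{\mathbf{M}}^{*} \otimes (\mathbf{F}(\mathbf{I}+\mathbf{F}^{\mathrm{H}}\mathbf{F})^{-1}\mathbf{F}^{\mathrm{H}})\right)\boldsymbol{\Omega}^{\mathrm{T}}\boldsymbol{\Omega}\mathrm{vec}(\mathbf{T}_{\mathrm{opt}}) = \boldsymbol{\Omega}\mathrm{vec}(\mathbf{F}(\mathbf{I}+\mathbf{F}^{\mathrm{H}}\mathbf{F})^{-1}\mathbf{F}^{\mathrm{H}}\mathbf{T}_{\mathrm{opt}}\hat{\mathbf{M}}) = -\boldsymbol{\Omega}\mathrm{vec}(\mathbf{F}\mathbf{M}\mathbf{P}). \tag{19}$$

This shows that the elements of the matrix $\mathbf{A} = \mathbf{F}(\mathbf{I} + \mathbf{F}^{\mathrm{H}}\mathbf{F})^{-1}\mathbf{F}^{\mathrm{H}}\mathbf{T}_{\mathrm{opt}}\hat{\mathbf{M}} + \mathbf{F}\mathbf{M}\mathbf{P}$ are zero
wherever $\mathbf{T}$ can be non-zero. Hence $\mathbf{A}$ is banded within diagonals $[0, \nu]$. On the other hand,
with the optimal $\mathbf{W}$ given in (11) and $\mathbf{M}$, $\hat{\mathbf{M}}$ defined in (9) and (10), we have

$$\mathbf{F}^{\mathrm{H}}(\mathbf{W}_{\mathrm{opt}}\mathbf{H} - \mathbf{T}_{\mathrm{opt}}) - (\mathbf{I}+\mathbf{F}^{\mathrm{H}}\mathbf{F}) = (\mathbf{I}+\mathbf{F}^{\mathrm{H}}\mathbf{F})\mathbf{F}^{-1}\mathbf{A}\mathbf{P}^{-1}. \tag{20}$$

Note that $\mathbf{F}^{-1}$ is lower triangular since $\mathbf{F}$ is lower triangular, $\mathbf{I}+\mathbf{F}^{\mathrm{H}}\mathbf{F}$ is banded within diagonals
$[-\nu, \nu]$, and $\mathbf{P}$ is diagonal. Utilizing Lemma 2, the r.h.s. in (20) is banded within diagonals
$[-\nu, K-1]$. Therefore $\mathbf{F}^{\mathrm{H}}(\mathbf{W}_{\mathrm{opt}}\mathbf{H} - \mathbf{T}_{\mathrm{opt}})$ is also banded within diagonals $[-\nu, K-1]$. ■

September 23, 2015 DRAFT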

B. Method II 

Method II originates from Ungerboeck's 1974 paper [33]. Different from Method I, an Ungerboeck
detection model (3) instead of the Forney model (7) is applied. The Ungerboeck model has
been extensively discussed in [36]-[38]. The system model (3) in Method II has the following
constraints for the involved matrices:

• $\mathbf{V}$ is a $K \times N$ matrix with no constraints.

• $\mathbf{G}$ is a $K \times K$ Hermitian matrix that satisfies $\mathbf{G} = [\mathbf{G}]_{\nu}$ and $\mathbf{G} + \mathbf{I} \succ \mathbf{0}$. $\nu$ is denoted as the
memory length of $\mathbf{G}$.

• $\mathbf{R}$ is a $K \times K$ matrix whose shape can be specified; three typical shapes are of interest
and are investigated.

Instead of optimizing the matrices $(\mathbf{W}, \mathbf{T}, \mathbf{F})$ for (8) in Method I, we now optimize the matrices
$(\mathbf{V}, \mathbf{R}, \mathbf{G})$ for (6) directly. In Method II we use the same definition of the indication matrix $\boldsymbol{\Omega}$
as in Method I, but now $\boldsymbol{\Omega}$ corresponds to the matrix $\mathbf{R}$ instead of $\mathbf{T}$. We continue to let $S$ denote
the number of elements that are allowed to be non-zero in $\mathbf{R}$. That is, the $S \times 1$ vector $\boldsymbol{\Omega}\mathrm{vec}(\mathbf{R})$
stacks the columns of $\mathbf{R}$ on top of each other, but with all elements that are constrained to zero
removed. Then we have the following Proposition 3.

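Proposition 3. Define an $S \times 1$ vector $\mathbf{d} = \boldsymbol{\Omega}\mathrm{vec}(\mathbf{M}\mathbf{P})$. The optimal $\mathbf{V}$ for the GMI in (6) is

$$\mathbf{V}_{\mathrm{opt}} = (\mathbf{I} + \mathbf{G} + \mathbf{R}_{\mathrm{opt}}\mathbf{P})\mathbf{H}^{\mathrm{H}}(\mathbf{H}\mathbf{H}^{\mathrm{H}} + N_0\mathbf{I})^{-1}, \tag{21}$$

and when $\mathbf{P} \neq \mathbf{0}$, the optimal $\mathbf{R}$ for the GMI in (6) is given by

$$\mathrm{vec}(\mathbf{R}_{\mathrm{opt}}) = -\boldsymbol{\Omega}^{\mathrm{T}}\left(\boldsymbol{\Omega}\left(\hat{\mathbf{M}}^{*} \otimes (\mathbf{I}+\mathbf{G})^{-1}\right)\boldsymbol{\Omega}^{\mathrm{T}}\right)^{-1}\mathbf{d}. \tag{22}$$

With the optimal $\mathbf{V}$ and $\mathbf{R}$, the GMI in (6) equals

$$I_{\mathrm{GMI}}(\mathbf{V}_{\mathrm{opt}},\mathbf{R}_{\mathrm{opt}},\mathbf{G}) = \begin{cases} I_2(\mathbf{G}), & \mathbf{P} = \mathbf{0} \\ I_2(\mathbf{G}) + \delta_2(\mathbf{G}), & \mathbf{P} \neq \mathbf{0}. \end{cases} \tag{23}$$

The functions $I_2(\mathbf{G})$ and $\delta_2(\mathbf{G})$ are defined as

$$I_2(\mathbf{G}) = K + \log(\det(\mathbf{I}+\mathbf{G})) + \mathrm{Tr}(\mathbf{M}(\mathbf{I}+\mathbf{G})), \tag{24}$$

$$\delta_2(\mathbf{G}) = -\mathbf{d}^{\mathrm{H}}\left(\boldsymbol{\Omega}\left(\hat{\mathbf{M}}^{*} \otimes (\mathbf{I}+\mathbf{G})^{-1}\right)\boldsymbol{\Omega}^{\mathrm{T}}\right)^{-1}\mathbf{d}. \tag{25}$$

The proof is given in Appendix E. Similar to $\delta_1(\mathbf{F})$ in Method I, $\delta_2(\mathbf{G}) \geq 0$ represents the GMI
increment from the soft information feedback in Method II.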
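The closed forms (21), (22), (24) and (25) translate almost line by line into NumPy. The sketch below is one possible implementation under stated assumptions: the function name `optimal_v_r`, the example channel, the diagonal of $\mathbf{P}$ and the banded $\mathbf{G}$ are all illustrative, and the mask selects the shape in which only the main diagonal of $\mathbf{R}$ is constrained to zero.

```python
import numpy as np

def vec(A):
    """Stack the columns of A on top of each other."""
    return A.flatten(order="F")

def optimal_v_r(H, P, G, N0, r_mask):
    """V_opt, R_opt from (21)-(22) and the terms I_2, delta_2 of (24)-(25).

    r_mask marks the entries of R that are allowed to be non-zero.
    """
    N, K = H.shape
    Hh = H.conj().T
    B_inv = np.linalg.inv(H @ Hh + N0 * np.eye(N))
    M = Hh @ B_inv @ H - np.eye(K)                         # (9)
    M_hat = P @ (np.eye(K) + M) @ P - P                    # (10)
    pos = np.flatnonzero(vec(r_mask))
    omega = np.zeros((pos.size, K * K))                    # indication matrix
    omega[np.arange(pos.size), pos] = 1.0
    IG_inv = np.linalg.inv(np.eye(K) + G)
    A = omega @ np.kron(M_hat.conj(), IG_inv) @ omega.T
    d = omega @ vec(M @ P)
    r = -omega.T @ np.linalg.solve(A, d)                   # (22)
    R = r.reshape((K, K), order="F")
    V = (np.eye(K) + G + R @ P) @ Hh @ B_inv               # (21)
    I2 = K + np.linalg.slogdet(np.eye(K) + G)[1] + np.trace(M @ (np.eye(K) + G)).real
    delta2 = -(d.conj() @ np.linalg.solve(A, d)).real      # (25)
    return V, R, float(I2), float(delta2)

rng = np.random.default_rng(0)
N, K, N0 = 6, 4, 0.5
H = (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))) / np.sqrt(2)
P = np.diag([0.2, 0.5, 0.7, 0.9])                          # assumed prior-quality values
G = np.eye(K) + 0.2 * (np.eye(K, k=1) + np.eye(K, k=-1))   # Hermitian, banded, nu = 1
r_mask = ~np.eye(K, dtype=bool)                            # only the diagonal of R is zero
V, R, I2, delta2 = optimal_v_r(H, P, G, N0, r_mask)
```

Plugging the resulting `V` and `R` back into the general expression (6) gives `I2 + delta2` up to rounding, which is a convenient consistency check when experimenting with other shapes of the mask.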


When $\mathbf{P} \neq \mathbf{0}$, the optimization over $\mathbf{G}$ in (23) uses a gradient based numerical optimization
procedure, and the gradient of $I_{\mathrm{GMI}}(\mathbf{V}_{\mathrm{opt}}, \mathbf{R}_{\mathrm{opt}}, \mathbf{G})$ with respect to (the non-zero part of) $\mathbf{G}$ is
provided in Appendix F. When $\mathbf{P} = \mathbf{0}$, the optimal $\mathbf{G}$ for (24) is provided in closed form by Theorem
2 and used as the starting point for the optimization procedure for $\mathbf{P} \neq \mathbf{0}$. However, different
from Method I, the optimization problem is concave and the proof is given in Appendix G.


Although the optimal matrix $\mathbf{R}$ is solved for in closed form as in (22), we have not yet specified
the constraint (reflected by the indication matrix $\boldsymbol{\Omega}$) on $\mathbf{R}$. As we are interested in the
comparison between Method I and Method II, we introduce a first shape, type (a), that is the
same as for $\mathbf{R} = \mathbf{F}^{\mathrm{H}}\mathbf{T}$ in Method I. The second shape, type (b), only limits the diagonal
elements of $\mathbf{R}$ to be zero; the reason is that we intend to eliminate the interference as well
as possible. Finally, we introduce shape type (c), in which we limit $\mathbf{R}$ to have the opposite form
of the matrix $\mathbf{G}$, that is, the elements of $\mathbf{R}$ are constrained to be zero wherever $\mathbf{G}$ is non-zero. The
intention is to only cancel the interference that the trellis search process in the BCJR, represented
by $\mathbf{G}$, cannot handle. Shape (c) is based on the same idea as Method I, but operates on the
Ungerboeck model instead of the Forney model. These three types are depicted in Figure 3, where
$\nu$ is the memory length constraint for $\mathbf{G}$. In the following we refer to "Method II with an $\mathbf{R}$
of shape type (a), type (b) and type (c)" as "Method II.a", "Method II.b", and "Method II.c,"
respectively.

Similar to Method I, the connections between the optimal front-end filter $\mathbf{V}$ and the optimal
interference cancelation matrix $\mathbf{R}$ in Method II are established in Proposition 4.


Proposition 4. For $\mathbf{P} \neq \mathbf{0}$ and the optimal $\mathbf{V}$ and $\mathbf{R}$,

$$[\mathbf{V}_{\mathrm{opt}}\mathbf{H}]_{\backslash(\nu+\nu_R)} = [\mathbf{R}_{\mathrm{opt}}]_{\backslash(\nu+\nu_R)}. \tag{26}$$

That is, the elements of $\mathbf{V}_{\mathrm{opt}}\mathbf{H}$ and $\mathbf{R}_{\mathrm{opt}}$ are equal outside the center $2(\nu+\nu_R)+1$ diagonals for
any $\mathbf{G}$ that is banded within diagonals $[-\nu, \nu]$, where for Method II.a and Method II.b, $\nu_R = 0$
and for Method II.c, $\nu_R = \nu$.

Fig. 3: Three different types of shape of matrix R, labeled (a), (b) and (c).


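Proof: Following similar steps as in the proof of Proposition 2, (22) can be rewritten as

$$\boldsymbol{\Omega}\mathrm{vec}((\mathbf{I} + \mathbf{G})^{-1}\mathbf{R}_{\mathrm{opt}}\hat{\mathbf{M}}) = -\boldsymbol{\Omega}\mathrm{vec}(\mathbf{M}\mathbf{P}). \tag{27}$$

It shows that the elements of the matrix $\mathbf{A} = (\mathbf{I} + \mathbf{G})^{-1}\mathbf{R}_{\mathrm{opt}}\hat{\mathbf{M}}\mathbf{P}^{-1} + \mathbf{M}$ are zero wherever $\mathbf{R}$
can be non-zero. On the other hand, with the optimal $\mathbf{V}$ given in (21) we have

$$\mathbf{V}_{\mathrm{opt}}\mathbf{H} - \mathbf{R}_{\mathrm{opt}} - (\mathbf{I}+\mathbf{G}) = (\mathbf{I}+\mathbf{G})\mathbf{A}. \tag{28}$$

As $\mathbf{I} + \mathbf{G}$ is banded within diagonals $[-\nu, \nu]$, utilizing Lemma 2 (the type (a) of $\mathbf{R}$ is slightly
different, but it can be verified straightforwardly), with the three types of $\mathbf{R}$ defined in Figure 3
it can be shown that the r.h.s. in (28) is banded within diagonals $[-(\nu + \nu_R), \nu + \nu_R]$, where $\nu_R = 0$
for the type (a) and type (b) of $\mathbf{R}$ and $\nu_R = \nu$ for the type (c) of $\mathbf{R}$. Therefore $\mathbf{V}_{\mathrm{opt}}\mathbf{H} - \mathbf{R}_{\mathrm{opt}}$ on
the l.h.s. in (28) is banded within diagonals $[-(\nu + \nu_R), \nu + \nu_R]$, which proves Proposition 4. ■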


Remark 2. With the LMMSE-PIC demodulator, we have $\nu = \nu_R = 0$ and Proposition 4 is natural
and frequently used. With CS demodulators and $\nu \neq 0$, $\mathbf{V}_{\mathrm{opt}}\mathbf{H}$ and $\mathbf{R}$ are equal outside the
center $2(\nu + \nu_R) + 1$ diagonals, not the center $2\nu + 1$ diagonals where $\mathbf{G}$ is constrained to be
non-zero. This reveals an interesting fact: the signal part that is not considered in $\mathbf{G}$ shall
not be perfectly canceled inside the center $2\nu + 2\nu_R + 1$ diagonals. The LMMSE-PIC demodulator
follows this law, but the full nature of the interference cancelation process given in Proposition
4 is not seen with LMMSE-PIC as $\nu = \nu_R = 0$.
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C. Method III 


So far we have diseussed two types of CS demodulators whieh are based on the Forney and 
Ungerboeek models. However, as both need numerieal optimization to obtain the optimal CS 
parameter F or G, we provide a third method that has a closed form solution but is suboptimal 
in general. Method III will rely on the same operations as Method II for P = 0. 

With P = 0, which means that no soft information of the transmitted signal x is available, 
the GMI is given in ( [24] ). The optimal G can be derived from M from Theorem [^ following 
[18 Proposition 2]. By inserting Vopt given in (21) into (j^ and setting i? = 0, we can see that 
the demodulator operates on the mismatched function 


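$$\begin{aligned}
\tilde{p}(y|x) &= \exp\big(2\mathrm{Re}\{x^H V_{\mathrm{opt}}\,y\} - x^H G x\big) \\
&= \exp\big(2\mathrm{Re}\{x^H (I+G) H^H (HH^H + N_0 I)^{-1} y\} - x^H G x\big) \\
&= \exp\big(2\mathrm{Re}\{x^H (I+G)\check{x}\} - x^H G x\big)
\end{aligned} \tag{29}$$

where $\check{x} = H^H(HH^H + N_0 I)^{-1}y$ is the LMMSE estimate.

As can be seen in (29), the trellis search is based on $\check{x}$. With soft information we can replace the LMMSE estimate $\check{x}$ by the LMMSE-PIC estimate, which we denote as $\tilde{x}$. That is, instead of (29) we now operate on the mismatched function

$$\tilde{p}(y|x, \hat{x}) = \exp\big(2\mathrm{Re}\{x^H (I+G)\tilde{x}\} - x^H G x\big) \tag{30}$$

where $G$ has the same banded shape as in Method II, i.e., $G = [G]_{\nu}$.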

The LMMSE-PIC estimate $\tilde{x}$ is constructed as follows. As we prefer to handle the interference through the trellis search process, the interference cancelation should not be present within the memory length constraint $\nu$. In other words, the signal vector after the interference cancelation process that is used to form the $k$th symbol of $\tilde{x}$ is denoted as $y_k$ and defined as

$$y_k = y - \sum_{n \in A_k} h_n \hat{x}_n \tag{31}$$

where $h_n$ is the $n$th column of $H$ and $A_k = \{0 \le n \le K-1 \mid n \notin [\max(0, k-\nu), \min(k+\nu, K-1)]\}$.


Denote $P_n$ as the $n$th diagonal element of $P$. The Wiener filtering coefficients [39] for the $k$th symbol are calculated through

$$w_k = h_k^H\big(H C_k H^H + N_0 I\big)^{-1} \tag{32}$$

where $C_k$ is a diagonal matrix with the $n$th diagonal element defined as

$$C_k(n) = \begin{cases} 1 - P_n, & n \in A_k \\ 1, & \text{otherwise.} \end{cases} \tag{33}$$

The LMMSE-PIC estimate $\tilde{x}$ is then obtained through

$$\tilde{x} = [\, w_1 y_1 \;\; w_2 y_2 \;\; \cdots \;\; w_K y_K \,]^T = Wy - C\hat{x} \tag{34}$$

with the coefficient matrix $W$ and interference cancelation matrix $C$ defined as

$$W = [\, w_1^T \;\; w_2^T \;\; \cdots \;\; w_K^T \,]^T, \tag{35}$$

$$C = [WH]_{\backslash\nu}. \tag{36}$$

Fig. 4: A graphical overview of Method III with $K = 4$ and $\nu = 1$.
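The construction in (31)-(36) can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: the dimensions, the noise level, the random channel, the diagonal $P$ and the stand-in soft symbols `x_hat` are all assumptions made for the example. It builds each $w_k$ and $y_k$ separately and compares the result with the matrix form $Wy - C\hat{x}$.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, nu, N0 = 6, 5, 1, 0.5            # hypothetical sizes and noise level

H = (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))) / np.sqrt(2)
P = np.diag(rng.uniform(0.1, 0.9, K))  # assumed diagonal feedback-quality matrix
y = rng.standard_normal(N) + 1j * rng.standard_normal(N)
x_hat = rng.standard_normal(K) + 1j * rng.standard_normal(K)  # stand-in soft symbols


def outside_band(A, width):
    """Keep only the entries of A outside the center 2*width+1 diagonals."""
    i, j = np.indices(A.shape)
    return np.where(np.abs(i - j) > width, A, 0)


W = np.zeros((K, N), dtype=complex)
x_tilde_direct = np.zeros(K, dtype=complex)
for k in range(K):
    A_k = [n for n in range(K) if abs(n - k) > nu]   # symbols canceled for x_k, as in (31)
    y_k = y - H[:, A_k] @ x_hat[A_k]
    c_k = np.ones(K)
    c_k[A_k] = 1.0 - np.diag(P)[A_k]                 # residual variances, as in (33)
    cov = H @ np.diag(c_k) @ H.conj().T + N0 * np.eye(N)
    W[k] = H[:, k].conj() @ np.linalg.inv(cov)       # Wiener row w_k, as in (32)
    x_tilde_direct[k] = W[k] @ y_k

C = outside_band(W @ H, nu)            # cancelation matrix, as in (36)
x_tilde_matrix = W @ y - C @ x_hat     # matrix form, as in (34)
```

Under these assumptions the per-symbol and matrix forms agree, and $C$ is zero inside the center $2\nu+1$ diagonals by construction.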


Putting $\tilde{x}$ in (34) back into (30), the system model we operate on reads

$$\tilde{p}(y|x,\hat{x}) = \exp\big(2\mathrm{Re}\{x^H((I+G)Wy - (I+G)C\hat{x})\} - x^H G x\big). \tag{37}$$

Note that (37) is also a special case of the Method II detection model, obtained by identifying the front-end filter $V = (I+G)W$ and the interference cancelation matrix $R = (I+G)C$. The GMI in this case reads, after some manipulations,

$$I_{\mathrm{GMI}}(G) = K + \log(\det(I+G)) + \mathrm{Tr}\big(\hat{M}(I+G)\big) \tag{38}$$

with $\hat{M}$ defined as

$$\hat{M} = WHPC^H + WH - PC^H + (WHPC^H + WH - PC^H)^H - W(HH^H + N_0 I)W^H - CPC^H - I. \tag{39}$$

It can be verified that $\hat{M}$ is the negative of the updated MSE matrix, that is,

$$\hat{M} = -\mathrm{E}\big[(x-\tilde{x})(x-\tilde{x})^H\big] = -\mathrm{E}\big[(x - Wy + C\hat{x})(x - Wy + C\hat{x})^H\big].$$

The optimal $G$ for (38) is obtained from the earlier theorem on the optimal $G$, and the optimal GMI is

$$I_{\mathrm{GMI}}(G_{\mathrm{opt}}) = \log(\det(I+G_{\mathrm{opt}})).$$
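The claim that the matrix in (39) equals the negative MSE matrix can be checked numerically. The sketch below is only an illustration under assumed second-order statistics that are not restated in this section: $\mathrm{E}[xx^H] = I$, $\mathrm{E}[x\hat{x}^H] = \mathrm{E}[\hat{x}\hat{x}^H] = P$ and $y = Hx + n$ with noise variance $N_0$. The front-end `W` is random here, since the identity does not depend on the Wiener choice.

```python
import numpy as np

rng = np.random.default_rng(1)
N, K, nu, N0 = 6, 5, 1, 0.3            # hypothetical sizes and noise level


def crandn(*shape):
    """Complex standard normal samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


H = crandn(N, K)
W = crandn(K, N)                       # arbitrary front-end for the check
i, j = np.indices((K, K))
C = np.where(np.abs(i - j) > nu, W @ H, 0)
P = np.diag(rng.uniform(0.1, 0.9, K))
I_K, I_N = np.eye(K), np.eye(N)

# Assumed joint covariance of the stacked vector [x; y; x_hat]
Sigma = np.block([
    [I_K, H.conj().T, P],
    [H, H @ H.conj().T + N0 * I_N, H @ P],
    [P, P @ H.conj().T, P],
])
A = np.hstack([I_K, -W, C])            # error x - W y + C x_hat as a linear map
M_hat_mse = -A @ Sigma @ A.conj().T    # negative MSE matrix

T = W @ H @ P @ C.conj().T + W @ H - P @ C.conj().T
M_hat_39 = (T + T.conj().T
            - W @ (H @ H.conj().T + N0 * I_N) @ W.conj().T
            - C @ P @ C.conj().T - I_K)  # expression (39)
```

Under the stated assumptions both matrices coincide, which is a quick sanity check when implementing (38).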


A graphical overview of Method III for $K = 4$ and $\nu = 1$ is illustrated in Figure 4. Note that for any band shaped matrix $G$ with memory length $\nu$, the interference cancelation matrix $(I+G)C$ is zero along the main diagonal. Therefore, in the GMI sense, Method III will not outperform Method II.b. But the GMI of Method III may outperform the GMI of Method II.c, as it can be verified that a type (c) $R$ has zeros at the positions where $(I+G)C$ is zero; therefore Method II.c has fewer degrees of freedom (DoFs) for designing $R$ than Method III.


Remark 3. Proposition 4 also holds for Method III with $\nu_R = \nu$. As $WH - C = [WH]_{\nu}$, by the lemma on products of banded matrices $(I+G)(WH - C)$ is banded within diagonals $[-2\nu, 2\nu]$, which shows that

$$[(I+G)WH]_{\backslash 2\nu} = [(I+G)C]_{\backslash 2\nu}.$$
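Both structural claims above (the zero main diagonal of $(I+G)C$ and the equality outside the center $4\nu+1$ diagonals) depend only on band patterns, so they can be checked with random matrices. In this sketch the matrix `B` is a hypothetical stand-in for $WH$, and the size and memory length are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)
K, nu = 8, 2                            # hypothetical block length and memory


def band_mask(size, width):
    """Boolean mask of the center 2*width+1 diagonals."""
    i, j = np.indices((size, size))
    return np.abs(i - j) <= width


B = rng.standard_normal((K, K)) + 1j * rng.standard_normal((K, K))  # stands in for W H
G = rng.standard_normal((K, K)) + 1j * rng.standard_normal((K, K))
G = np.where(band_mask(K, nu), G + G.conj().T, 0)   # Hermitian, banded within [-nu, nu]

C = np.where(band_mask(K, nu), 0, B)                # entries of B outside the band
IG = np.eye(K) + G

diag_of_IGC = np.diag(IG @ C)                       # expected to vanish
outside_2nu = ~band_mask(K, 2 * nu)
gap = (IG @ B - IG @ C)[outside_2nu]                # expected to vanish as well
```

The second check is the banded-product argument of Remark 3: a product of two matrices banded within $\nu$ diagonals is banded within $2\nu$ diagonals.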


V. Parameter Optimization for ISI Channel


In this section, we extend the CS demodulators to ISI channels for all three methods. The 
difference from finite length linear vector channels is that with ISI channels the channel matrix is 
infinitely large, but this can be dealt with using [|^-[|^. The formulas for the achievable rates 


in (|^, ([^ and ( [38| ) can be directly applied to ([^, but as the achievable rate Jgmi (as a function 
of the specified CS parameters) is then dependent on the block length K, we are interested in 
the asymptotic rate 


$$I = \lim_{K \to \infty} \frac{1}{K} I_{\mathrm{GMI}}.$$


Ideally, in the ISI case the front-end matrices $W$ and $V$ correspond to linear filtering operations. The filters are infinitely long, but in practice filters with finite tap lengths are used. Therefore, we analyze the properties of $W$ and $V$ with a finite number of taps. In other words, we approximate $W$ and $V$ by band shaped matrices and constrain $W$ and $V$ to be zero outside the band. However, the band size can be arbitrary and sufficiently large so that we can analyze the asymptotic properties. The same holds for the interference cancelation matrices $T$ and $R$, as they are Toeplitz matrices in the ISI case and multiplying $T$ and $R$ with $\hat{x}$ can be replaced by filtering operations. Therefore, $T$ and $R$ are also approximated by band shaped matrices. Moreover, the trellis representation matrices $F$ and $G$ are by definition constrained to be band shaped with a limited memory length $\nu$, and the channel matrix $H$ itself is a band shaped Toeplitz matrix. Therefore, in the ISI case all matrices we consider are assumed to be band shaped Toeplitz matrices.

In [40] a complete theoretic machinery for ISI channels is derived, and a result is that as $K \to \infty$, the linear convolution of the channel model can be replaced with a circular convolution. We can then let $H$ and all other band shaped Toeplitz matrices represent circular convolution matrices, and apply Szegő's eigenvalue distribution theorem [41] in order to evaluate $I$.

In the following, we denote the Fourier series associated to a band shaped Toeplitz matrix $E$ that has infinitely large dimensions by $E(\omega)$. $E$ is constrained to be zero except for the middle $2N_E+1$ diagonals, i.e., the band size is $2N_E+1$. $N_E$ is referred to as the tap length for $E(\omega)$. The Fourier series $E(\omega)$ is specified by the vector $e = [\, e_{-N_E} \; \cdots \; e_{-1} \;\; e_0 \;\; e_1 \; \cdots \; e_{N_E} \,]$, where $e_0$ is the element on the main diagonal, $e_k$ ($k > 0$) is the element on the $k$th lower diagonal, and $e_{-k}$ is the element on the $k$th upper diagonal of $E$. The Fourier series $E(\omega)$ is defined as [42]

$$E(\omega) = \sum_{k=-N_E}^{N_E} e_k \exp(jk\omega).$$

As all quantities are evaluated as the block length grows large, the transform $E(\omega)$ approaches the eigenvalue distribution of $E$ (see [41], [42] for a precise statement of this result). The element $e_k$ can be obtained from $E(\omega)$ through an ordinary inverse Fourier formula.
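For the circular-convolution case the link between a banded matrix and its Fourier series can be seen directly: the eigenvalues of a circulant matrix are samples of $E(\omega)$ on a uniform grid, so normalized log-determinants turn into averages over $\omega$. The sketch below uses hypothetical Hermitian taps and a hypothetical block length, chosen only for illustration.

```python
import numpy as np


def circulant_from_taps(taps, K):
    """Circulant matrix with tap e_k on the k-th lower diagonal (k < 0: upper)."""
    E = np.zeros((K, K), dtype=complex)
    for k, e_k in taps.items():
        for i in range(K):
            E[i, (i - k) % K] += e_k
    return E


def fourier_series(taps, omega):
    """E(omega) = sum_k e_k exp(j k omega)."""
    return sum(e_k * np.exp(1j * k * omega) for k, e_k in taps.items())


K = 64
taps = {0: 2.0, 1: 0.5 - 0.3j, -1: 0.5 + 0.3j}   # hypothetical Hermitian, positive spectrum
E = circulant_from_taps(taps, K)
omega = 2 * np.pi * np.arange(K) / K

samples = fourier_series(taps, omega).real        # E(omega) is real for Hermitian taps
eigs = np.linalg.eigvalsh(E)
max_dev = np.max(np.abs(np.sort(eigs) - np.sort(samples)))

rate_matrix = np.linalg.slogdet(E)[1] / K         # (1/K) log det E
rate_spectrum = np.mean(np.log(samples))          # grid average of log E(omega)
```

For banded Toeplitz (non-circulant) matrices the same statement holds only asymptotically in the block length, which is the content of the Szegő theorem cited above.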

Furthermore, since the whole data block experiences the same channel, we assume $P = \alpha I$ ($0 \le \alpha \le 1$), where $\alpha$ refers to the quality of the side information. We first state Theorem 3, which is an asymptotic version for ISI channels of the earlier finite-dimensional theorem on the optimal $G$.

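Theorem 3. Assume that $G(\omega)$ and $M(\omega)$ are the Fourier series associated to band shaped Toeplitz matrices $G$ and $M$, respectively. The dimensions of $G$ and $M$ are infinitely large and $G$ is constrained to be zero outside the center $2\nu+1$ diagonals. Moreover, we assume $I+G \succ 0$ and $M \prec 0$. Define a scalar function $I$ with respect to $G(\omega)$ as

$$I(G(\omega)) = 1 + \frac{1}{2\pi}\int_{-\pi}^{\pi}\big(\log(1+G(\omega)) + M(\omega)(1+G(\omega))\big)\,d\omega, \tag{40}$$

and a $\nu \times 1$ vector

$$\varphi(\omega) = [\, \exp(j\omega) \;\; \exp(j2\omega) \;\; \cdots \;\; \exp(j\nu\omega) \,]^T,$$

then the optimal $G(\omega)$ that maximizes $I$ in (40) is

$$G_{\mathrm{opt}}(\omega) = |u_0 + \boldsymbol{u}\varphi(\omega)|^2 - 1,$$

where

$$u_0 = \frac{1}{\sqrt{\tau_1^H\tau_2^{-1}\tau_1 - \tau_0}}, \qquad \boldsymbol{u} = -u_0\,\tau_1^H\tau_2^{-1}, \tag{41}$$

and the real scalar $\tau_0$, $\nu \times 1$ vector $\tau_1$, and $\nu \times \nu$ matrix $\tau_2$ are defined as

$$\tau_0 = \frac{1}{2\pi}\int_{-\pi}^{\pi} M(\omega)\,d\omega, \qquad \tau_1 = \frac{1}{2\pi}\int_{-\pi}^{\pi} M(\omega)\varphi(\omega)\,d\omega, \qquad \tau_2 = \frac{1}{2\pi}\int_{-\pi}^{\pi} M(\omega)\varphi(\omega)\varphi(\omega)^H\,d\omega.$$

Furthermore, the optimal $I$ is

$$I(G_{\mathrm{opt}}(\omega)) = 2\log(u_0). \tag{42}$$

Footnote (on the circular convolution above): A conceptually simple way to realize this is to insert a cyclic prefix (of ISI channel tap length $L$) and then make the observation that the cyclic prefix has vanishing impact on energy and spectral efficiency as $K \to \infty$.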


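Proof: As $I+G \succ 0$, in order to maximize (40), we assume that $1+G(\omega) = |U(\omega)|^2$, with $U(\omega) = u_0 + \boldsymbol{u}\varphi(\omega)$ and $\boldsymbol{u} = [\, u_1 \;\; u_2 \;\; \cdots \;\; u_\nu \,]$. Then $I(G(\omega))$ in (40) can be rewritten as

$$I(G(\omega)) = 1 + 2\log(u_0) + \frac{1}{2\pi}\int_{-\pi}^{\pi} M(\omega)\big(u_0^2 + 2\mathrm{Re}\{u_0\boldsymbol{u}\varphi(\omega)\} + \boldsymbol{u}\varphi(\omega)\varphi^H(\omega)\boldsymbol{u}^H\big)\,d\omega. \tag{43}$$

Taking the first order differentials with respect to $u_0$ and $\boldsymbol{u}$ and setting them to zero directly results in the optimal solution (41). Inserting (41) back into (43), and after some manipulations, the optimal asymptotic rate is then given in (42).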
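The closed form of Theorem 3 is straightforward to evaluate numerically. The following sketch is a hedged illustration: it assumes the spectrum $M(\omega) = -N_0/(N_0 + |H(\omega)|^2)$, uses the Proakis-B taps quoted in Section VII, picks $\nu = 1$ and $N_0 = 0.1$ arbitrarily, and approximates the integrals by averages on a uniform grid. It then compares the rate (40) evaluated at the closed-form solution with $2\log(u_0)$ from (42).

```python
import numpy as np

nu, N0 = 1, 0.1                                  # hypothetical memory and noise level
h = np.array([0.407, 0.815, 0.407])              # Proakis-B taps quoted in Section VII
omega = 2 * np.pi * np.arange(4096) / 4096       # grid mean approximates (1/2pi) * integral

H_w = sum(h_l * np.exp(1j * l * omega) for l, h_l in enumerate(h))
M_w = -N0 / (N0 + np.abs(H_w) ** 2)              # assumed strictly negative spectrum

phi = np.exp(1j * np.outer(np.arange(1, nu + 1), omega))   # nu x len(omega)
tau0 = np.mean(M_w)
tau1 = np.mean(M_w * phi, axis=1)                # nu-vector
tau2 = (M_w * phi) @ phi.conj().T / omega.size   # nu x nu matrix


def rate(U_w):
    """Evaluate (40) for 1 + G(omega) = |U(omega)|^2."""
    G_w = np.abs(U_w) ** 2 - 1
    return 1 + np.mean(np.log1p(G_w) + M_w * (1 + G_w))


q = np.real(tau1.conj() @ np.linalg.solve(tau2, tau1))     # tau1^H tau2^{-1} tau1
u0 = 1 / np.sqrt(q - tau0)
u_vec = -u0 * (tau1.conj() @ np.linalg.inv(tau2))          # row vector in (41)

I_opt = rate(u0 + u_vec @ phi)
I_closed = 2 * np.log(u0)                                  # value predicted by (42)
I_perturbed = rate(u0 + 0.9 * u_vec @ phi)                 # a nearby, non-optimal choice
```

In this setup the numerically evaluated rate matches the closed form, and shrinking the optimal taps by 10% lowers the rate, as expected from a maximizer.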
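A. Method I

The structures of matrices $(W, T, F)$ have the same constraints as in Section IV-A, except that now the matrices have infinite dimensions. Applying Szegő's theorem, the asymptotic rate reads

$$\begin{aligned}
I(W(\omega),T(\omega),F(\omega)) &= \lim_{K\to\infty}\frac{1}{K} I_{\mathrm{GMI}}(W,T,F) \\
&= \frac{1}{2\pi}\int_{-\pi}^{\pi}\Big(\log(1+|F(\omega)|^2) - |F(\omega)|^2 - \frac{L_1(\omega)}{1+|F(\omega)|^2}\Big)d\omega \\
&\quad + \frac{1}{\pi}\int_{-\pi}^{\pi}\mathrm{Re}\{F^*(\omega)(W(\omega)H(\omega) - \alpha T(\omega))\}\,d\omega
\end{aligned} \tag{44}$$

where

$$L_1(\omega) = |F(\omega)W(\omega)|^2(N_0 + |H(\omega)|^2) + \alpha|F(\omega)T(\omega)|^2 - 2\alpha|F(\omega)|^2\mathrm{Re}\{H(\omega)W(\omega)T^*(\omega)\}$$

and $H(\omega)$, $F(\omega)$, $W(\omega)$ and $T(\omega)$ are Fourier series associated to the band shaped Toeplitz matrices $H$, $F$, $W$ and $T$, respectively.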

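Applying Szegő's eigenvalue distribution theorem, the Fourier series associated to $M$ and to $\tilde{M}$ in (10) are

$$M(\omega) = -\frac{N_0}{N_0 + |H(\omega)|^2}, \tag{45}$$

$$\tilde{M}(\omega) = \alpha^2(M(\omega) + 1) - \alpha. \tag{46}$$

Define a $(2N_T - \nu) \times 1$ vector

$$\phi(\omega) = [\, \exp(-jN_T\omega) \;\; \cdots \;\; \exp(-j\omega) \;\; \exp(j(\nu+1)\omega) \;\; \cdots \;\; \exp(jN_T\omega) \,]^T, \tag{47}$$

a $(2N_T - \nu) \times 1$ vector $\varepsilon_1$, and a $(2N_T - \nu) \times (2N_T - \nu)$ Hermitian matrix $\varepsilon_2$ as

$$\varepsilon_1 = \frac{\alpha}{2\pi}\int_{-\pi}^{\pi} M(\omega)F^*(\omega)\phi(\omega)\,d\omega, \qquad \varepsilon_2 = \frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{\tilde{M}(\omega)|F(\omega)|^2\phi(\omega)\phi(\omega)^H}{1+|F(\omega)|^2}\,d\omega, \tag{48}$$

where $N_T$ is the tap length of $T(\omega)$ and $\nu+1$ is the band size where matrix $T$ is constrained to zero. Then we have Proposition 5, with the proof given in Appendix H.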


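Proposition 5. The optimal $W(\omega)$ for the asymptotic rate in (44) is

$$W_{\mathrm{opt}}(\omega) = \frac{H^*(\omega)}{F^*(\omega)(N_0 + |H(\omega)|^2)}\big(1 + |F(\omega)|^2 + \alpha F^*(\omega)T_{\mathrm{opt}}(\omega)\big), \tag{49}$$

and when $0 < \alpha \le 1$, the optimal $T(\omega)$ in (44) is

$$T_{\mathrm{opt}}(\omega) = -\varepsilon_1^H\varepsilon_2^{-1}\phi(\omega). \tag{50}$$

With the optimal $W(\omega)$ and $T(\omega)$, the asymptotic rate reads

$$I(W_{\mathrm{opt}}(\omega), T_{\mathrm{opt}}(\omega), F(\omega)) = \begin{cases} I_1(F(\omega)), & \alpha = 0 \\ I_1(F(\omega)) + \delta_1(F(\omega)), & 0 < \alpha \le 1. \end{cases} \tag{51}$$

The functions $I_1(F(\omega))$ and $\delta_1(F(\omega))$ are defined as

$$I_1(F(\omega)) = 1 + \frac{1}{2\pi}\int_{-\pi}^{\pi}\big(\log(1+|F(\omega)|^2) + M(\omega)(1+|F(\omega)|^2)\big)\,d\omega, \tag{52}$$

$$\delta_1(F(\omega)) = -\varepsilon_1^H\varepsilon_2^{-1}\varepsilon_1. \tag{53}$$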

A closed form solution of the optimal $F(\omega)$ in (51) seems out of reach and a gradient based optimization is therefore used. Note that if we replace $|F(\omega)|^2$ by $G(\omega)$, (52) has the same form as (40) in Theorem 3, and the optimal solution of $G(\omega)$ is in closed form. However, as the optimal $G(\omega)$ cannot always be decomposed as $G(\omega) = |F(\omega)|^2$, (52) also needs a numerical optimization whenever $G(\omega)$ is not real and positive for all $\omega$. The starting point for the optimization procedure is chosen similarly to the finite linear vector channel case. In the ISI case, the rate in Method I is still not concave; an example is provided in Appendix C.


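We next derive the differentials of the optimal asymptotic rate with respect to $F(\omega)$ in (51). With memory constraint $\nu$, the Fourier series associated to $F$ is

$$F(\omega) = \sum_{k=0}^{\nu} f_k \exp(jk\omega)$$

where $\nu$ is the memory length and $f_k$ is the non-zero element at the $k$th lower diagonal of $F$. The differential of $I_1(F(\omega))$ with respect to $f_k$ is

$$\frac{\partial I_1(F(\omega))}{\partial f_k} = \frac{1}{2\pi}\int_{-\pi}^{\pi}\Big(\frac{1}{1+|F(\omega)|^2} + M(\omega)\Big)\sum_{m=0}^{\nu} f_m^*\exp(j(k-m)\omega)\,d\omega,$$

and the differential of $\delta_1(F(\omega))$ with respect to $f_k$ is

$$\frac{\partial \delta_1(F(\omega))}{\partial f_k} = -\frac{\partial \varepsilon_1^H}{\partial f_k}\varepsilon_2^{-1}\varepsilon_1 + \varepsilon_1^H\varepsilon_2^{-1}\frac{\partial \varepsilon_2}{\partial f_k}\varepsilon_2^{-1}\varepsilon_1$$

where

$$\frac{\partial \varepsilon_1^H}{\partial f_k} = \frac{\alpha}{2\pi}\int_{-\pi}^{\pi} M(\omega)\phi(\omega)^H\exp(jk\omega)\,d\omega,$$

$$\frac{\partial \varepsilon_2}{\partial f_k} = \frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{\tilde{M}(\omega)\phi(\omega)\phi(\omega)^H}{(1+|F(\omega)|^2)^2}\sum_{m=0}^{\nu} f_m^*\exp(j(k-m)\omega)\,d\omega.$$

Footnote (on $\delta_1$ in (51)): Similar to finite length linear vector channels, $\delta_1(F(\omega))$ in (53) is only defined for $\alpha \neq 0$, as when $\alpha = 0$, $\tilde{M}(\omega) = 0$ and the inversion part in $\delta_1(F(\omega))$ is not well defined. The same comment holds for $\delta_2(G(\omega))$ in (63).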

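The connection between the optimal front-end filter and the interference cancelation matrix established for the finite-length case also holds for ISI channels. The asymptotic version, which shows the relationship between the optimal $W(\omega)$ and $T(\omega)$, is stated in Proposition 6.

Proposition 6. When $0 < \alpha \le 1$, define the Fourier transforms

$$a_k = \frac{1}{2\pi}\int_{-\pi}^{\pi} F^*(\omega)W_{\mathrm{opt}}(\omega)H(\omega)\exp(-jk\omega)\,d\omega,$$

$$b_k = \frac{1}{2\pi}\int_{-\pi}^{\pi} F^*(\omega)T_{\mathrm{opt}}(\omega)\exp(-jk\omega)\,d\omega,$$

then $a_k = b_k$ holds for $k \le -(\nu+1)$.

Proof: In Appendix H, the optimal $t$ in (94) satisfies

$$t_{\mathrm{opt}}\varepsilon_2 = -\varepsilon_1^H.$$

With the definitions of $\varepsilon_1$, $\varepsilon_2$ in (48), this is equivalent to

$$\frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{\tilde{M}(\omega)|F(\omega)|^2T_{\mathrm{opt}}(\omega)\phi(\omega)^H}{1+|F(\omega)|^2}\,d\omega = -\frac{\alpha}{2\pi}\int_{-\pi}^{\pi} F(\omega)M(\omega)\phi(\omega)^H\,d\omega. \tag{54}$$

On the other hand, with $W_{\mathrm{opt}}$ in (49) and $M(\omega)$, $\tilde{M}(\omega)$ defined in (45) and (46), we have

$$\begin{aligned}
&\frac{1}{2\pi}\int_{-\pi}^{\pi}\big(F^*(\omega)W_{\mathrm{opt}}(\omega)H(\omega) - F^*(\omega)T_{\mathrm{opt}}(\omega) - (1+|F(\omega)|^2)\big)\exp(-jk\omega)\,d\omega \\
&\quad = \frac{1}{2\pi}\int_{-\pi}^{\pi}\Big(\frac{\tilde{M}(\omega)F^*(\omega)T_{\mathrm{opt}}(\omega)}{\alpha} + (1+|F(\omega)|^2)M(\omega)\Big)\exp(-jk\omega)\,d\omega.
\end{aligned} \tag{55}$$

Transforming (54) and (55) back into matrix forms, we have that (19) and (20) hold. Following the same arguments as in the proof of the corresponding finite-length proposition, $F^H(W_{\mathrm{opt}}H - T_{\mathrm{opt}})$ is banded within diagonals $[-\nu, K-1]$. Therefore we have

$$\frac{1}{2\pi}\int_{-\pi}^{\pi}\big(F^*(\omega)W_{\mathrm{opt}}(\omega)H(\omega) - F^*(\omega)T_{\mathrm{opt}}(\omega)\big)\exp(-jk\omega)\,d\omega = 0$$

whenever $k \le -(\nu+1)$, which proves Proposition 6. ■

B. Method II

The structures of matrices $(V, R, G)$ in the ISI case have the same constraints as in Section IV-B, while the dimensions of these matrices are infinitely large. The type (a) $R$ in Figure 3 is not meaningful as $N, K \to \infty$. We use the same definition of $\nu_R$ as in Proposition 4. That is, $\nu_R$ equals 0 or $\nu$, corresponding to type (b) or type (c) of $R$ as shown in Figure 3. Type (a) is not considered for ISI.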

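Applying Szegő's theorem, the asymptotic rate for Method II is

$$\begin{aligned}
I(V(\omega),R(\omega),G(\omega)) &= \lim_{K\to\infty}\frac{1}{K} I_{\mathrm{GMI}}(V,R,G) \\
&= \frac{1}{2\pi}\int_{-\pi}^{\pi}\Big(\log(1+G(\omega)) - G(\omega) - \frac{L_2(\omega)}{1+G(\omega)}\Big)d\omega \\
&\quad + \frac{1}{\pi}\int_{-\pi}^{\pi}\mathrm{Re}\{V(\omega)H(\omega) - \alpha R(\omega)\}\,d\omega
\end{aligned} \tag{56}$$

where

$$L_2(\omega) = |V(\omega)|^2(N_0 + |H(\omega)|^2) + \alpha|R(\omega)|^2 - 2\alpha\mathrm{Re}\{H(\omega)V(\omega)R^*(\omega)\}$$

and $V(\omega)$, $R(\omega)$ and $G(\omega)$ are Fourier series associated to the band shaped Toeplitz matrices $V$, $R$, and $G$, respectively.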

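Define a $2(N_R - \nu_R) \times 1$ vector

$$\psi(\omega) = [\, \exp(-jN_R\omega) \;\; \cdots \;\; \exp(-j(\nu_R+1)\omega) \;\; \exp(j(\nu_R+1)\omega) \;\; \cdots \;\; \exp(jN_R\omega) \,]^T, \tag{57}$$

a $2(N_R - \nu_R) \times 1$ vector $\zeta_1$ and a $2(N_R - \nu_R) \times 2(N_R - \nu_R)$ Hermitian matrix $\zeta_2$ as

$$\zeta_1 = \frac{\alpha}{2\pi}\int_{-\pi}^{\pi} M(\omega)\psi(\omega)\,d\omega, \qquad \zeta_2 = \frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{\tilde{M}(\omega)\psi(\omega)\psi(\omega)^H}{1+G(\omega)}\,d\omega, \tag{58}$$

where $N_R$ denotes the tap length of $R_{\mathrm{opt}}(\omega)$, $2\nu_R+1$ is the band size where $R$ is constrained to zero, and $M(\omega)$ and $\tilde{M}(\omega)$ are defined in (45) and (46). Then with this notation, we have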


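Proposition 7. The optimal $V(\omega)$ for (56) is

$$V_{\mathrm{opt}}(\omega) = \frac{H^*(\omega)}{N_0 + |H(\omega)|^2}\big(1 + G(\omega) + \alpha R_{\mathrm{opt}}(\omega)\big), \tag{59}$$

and when $0 < \alpha \le 1$, the optimal $R(\omega)$ for the asymptotic rate in (56) is

$$R_{\mathrm{opt}}(\omega) = -\zeta_1^H\zeta_2^{-1}\psi(\omega). \tag{60}$$

With the optimal $V(\omega)$ and $R(\omega)$, the asymptotic rate reads

$$I(V_{\mathrm{opt}}(\omega), R_{\mathrm{opt}}(\omega), G(\omega)) = \begin{cases} I_2(G(\omega)), & \alpha = 0 \\ I_2(G(\omega)) + \delta_2(G(\omega)), & 0 < \alpha \le 1. \end{cases} \tag{61}$$

The functions $I_2(G(\omega))$ and $\delta_2(G(\omega))$ are defined as

$$I_2(G(\omega)) = 1 + \frac{1}{2\pi}\int_{-\pi}^{\pi}\big(\log(1+G(\omega)) + M(\omega)(1+G(\omega))\big)\,d\omega, \tag{62}$$

$$\delta_2(G(\omega)) = -\zeta_1^H\zeta_2^{-1}\zeta_1. \tag{63}$$

The proof is given in Appendix I, where we also show that $R(\omega)$ is real and the matrix $R$ has Hermitian symmetry.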
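The closed forms (60) and (63) can be evaluated on a frequency grid. The sketch below is an illustration only: the Proakis-B taps, $\alpha = 0.5$, $N_0 = 0.1$, the tap length $N_R = 6$, the trellis spectrum $G(\omega)$ and the assumed form $M(\omega) = -N_0/(N_0 + |H(\omega)|^2)$ are example choices, not values from the paper. It also evaluates the quadratic form $r\zeta_2 r^H + 2\mathrm{Re}\{r\zeta_1\}$, whose maximizer over the tap vector $r$ is $-\zeta_1^H\zeta_2^{-1}$ with maximum value $\delta_2$.

```python
import numpy as np

alpha, N0, nu_R, N_R = 0.5, 0.1, 1, 6           # illustrative values; nu_R = nu is type (c)
h = np.array([0.407, 0.815, 0.407])             # Proakis-B taps quoted in Section VII
omega = 2 * np.pi * np.arange(4096) / 4096      # grid mean approximates (1/2pi) * integral

H_w = sum(h_l * np.exp(1j * l * omega) for l, h_l in enumerate(h))
M_w = -N0 / (N0 + np.abs(H_w) ** 2)             # assumed form of M(omega)
M_tilde_w = alpha ** 2 * (M_w + 1) - alpha      # relation (46)
G_w = 0.5 + 0.4 * np.cos(omega)                 # hypothetical spectrum with 1 + G > 0

# Tap indices where R may be non-zero: -N_R..-(nu_R+1) and (nu_R+1)..N_R
k_idx = np.concatenate([np.arange(-N_R, -nu_R), np.arange(nu_R + 1, N_R + 1)])
psi = np.exp(1j * np.outer(k_idx, omega))       # vector (57) sampled on the grid

zeta1 = alpha * np.mean(M_w * psi, axis=1)
zeta2 = (M_tilde_w / (1 + G_w) * psi) @ psi.conj().T / omega.size

r_opt = -zeta1.conj() @ np.linalg.inv(zeta2)    # taps of R_opt, as in (60)
R_opt_w = r_opt @ psi
delta2 = np.real(-zeta1.conj() @ np.linalg.solve(zeta2, zeta1))   # increment (63)


def quadratic_form(r):
    """r zeta2 r^H + 2 Re{r zeta1}, maximized by r_opt."""
    return np.real(r @ zeta2 @ r.conj()) + 2 * np.real(r @ zeta1)
```

For this real symmetric channel the resulting $R_{\mathrm{opt}}(\omega)$ is real up to rounding, in line with the statement above, and the increment $\delta_2$ is positive.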

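The optimal $G(\omega)$ for (62) can be solved in closed form from Theorem 3. However, a closed form solution for the optimal $G(\omega)$ in (61) when $0 < \alpha \le 1$ again seems out of reach and a gradient based optimization is used. The asymptotic rate $I(V_{\mathrm{opt}}(\omega), R_{\mathrm{opt}}(\omega), G(\omega))$ in Method II is also concave with respect to $G(\omega)$; the proof is provided in Appendix J.

Below we derive the differentials of the optimal asymptotic rate with respect to $G(\omega)$ for $0 < \alpha \le 1$. As matrix $G$ is Hermitian, the associated Fourier series $G(\omega)$ is

$$G(\omega) = \sum_{k=-\nu}^{\nu} g_k\exp(jk\omega) = g_0 + 2\mathrm{Re}\Big\{\sum_{k=1}^{\nu} g_k\exp(jk\omega)\Big\} \tag{64}$$

where $g_k$ is the element at the $k$th diagonal of $G$. The differential of $I(V_{\mathrm{opt}}(\omega), R_{\mathrm{opt}}(\omega), G(\omega))$ with respect to $g_k$ is

$$\frac{\partial I}{\partial g_k} = \frac{1}{2\pi}\int_{-\pi}^{\pi}\Big(M(\omega) + \frac{1}{1+G(\omega)}\Big)\exp(jk\omega)\,d\omega + \zeta_1^H\zeta_2^{-1}\frac{\partial \zeta_2}{\partial g_k}\zeta_2^{-1}\zeta_1$$

where

$$\frac{\partial \zeta_2}{\partial g_k} = -\frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{\tilde{M}(\omega)\psi(\omega)\psi(\omega)^H}{(1+G(\omega))^2}\exp(jk\omega)\,d\omega.$$

The asymptotic version of Proposition 4 that shows the relationship between the optimal $V(\omega)$ and $R(\omega)$ for ISI channels is stated in Proposition 8.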


Proposition 8. When $0 < \alpha \le 1$, define the Fourier transforms

$$a_k = \frac{1}{2\pi}\int_{-\pi}^{\pi} V_{\mathrm{opt}}(\omega)H(\omega)\exp(-jk\omega)\,d\omega, \tag{65}$$

$$b_k = \frac{1}{2\pi}\int_{-\pi}^{\pi} R_{\mathrm{opt}}(\omega)\exp(-jk\omega)\,d\omega, \tag{66}$$

then $a_k = b_k$ holds for $|k| > \nu+\nu_R$, where $\nu_R = 0$ for Method II.b and $\nu_R = \nu$ for Method II.c.

Proof: In Appendix I, the optimal $r$ given in (99) satisfies

$$r_{\mathrm{opt}}\zeta_2 = -\zeta_1^H. \tag{67}$$

With the definitions of $\zeta_1$, $\zeta_2$ in (58), (67) is equivalent to

$$\frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{\tilde{M}(\omega)R_{\mathrm{opt}}(\omega)\psi(\omega)^H}{1+G(\omega)}\,d\omega = -\frac{\alpha}{2\pi}\int_{-\pi}^{\pi} M(\omega)\psi(\omega)^H\,d\omega. \tag{68}$$

On the other hand, with $V_{\mathrm{opt}}(\omega)$ in (59) and $M(\omega)$, $\tilde{M}(\omega)$ defined in (45) and (46), we have

$$\begin{aligned}
&\frac{1}{2\pi}\int_{-\pi}^{\pi}\big(V_{\mathrm{opt}}(\omega)H(\omega) - R_{\mathrm{opt}}(\omega) - (1+G(\omega))\big)\exp(-jk\omega)\,d\omega \\
&\quad = \frac{1}{2\pi}\int_{-\pi}^{\pi}\Big(\frac{\tilde{M}(\omega)R_{\mathrm{opt}}(\omega)}{\alpha} + M(\omega)(1+G(\omega))\Big)\exp(-jk\omega)\,d\omega.
\end{aligned} \tag{69}$$

Transforming (68) and (69) back into matrix forms, we have that (27) and (28) hold. Following the same arguments as in the proof of Proposition 4, $V_{\mathrm{opt}}H - R_{\mathrm{opt}}$ is banded within diagonals $[-(\nu+\nu_R), \nu+\nu_R]$. Therefore we have

$$\frac{1}{2\pi}\int_{-\pi}^{\pi}\big(V_{\mathrm{opt}}(\omega)H(\omega) - R_{\mathrm{opt}}(\omega)\big)\exp(-jk\omega)\,d\omega = 0$$

whenever $|k| \ge \nu+\nu_R+1$, which proves Proposition 8. ■

We provide an example to illustrate Proposition 8 in Figure 5 with Method II.c and $\nu = \nu_R = 1$. The Proakis-C [43] channel is tested at an SNR of 10 dB and $\alpha$, which represents the soft information feedback quality, equals 0.1, 0.4 and 0.8, respectively. Since $\nu_R = 1$, $b_k$ as defined in (66) is constrained to zero for $0 \le |k| \le 1$. As can be seen, $a_k$ as defined in (65) equals $b_k$ only for $|k| > 2$, and when $|k| = 2$, $a_k$ and $b_k$ are not identical. This shows that with the optimal $V(\omega)$ and $R(\omega)$, the signal part along the second upper and lower diagonals that is not considered in $G(\omega)$ shall not be perfectly canceled out. This behavior cannot be seen in [44], which treats LMMSE-PIC for ISI channels, since $\nu = \nu_R = 0$.


C. Method III

Similar to finite length linear vector channels, we also investigate Method III for ISI channels. Applying Szegő's theorem to (38), the asymptotic rate reads

$$I(G(\omega)) = \lim_{K\to\infty}\frac{1}{K} I_{\mathrm{GMI}}(G) = 1 + \frac{1}{2\pi}\int_{-\pi}^{\pi}\big(\log(1+G(\omega)) + \hat{M}(\omega)(1+G(\omega))\big)\,d\omega \tag{70}$$

where

$$\hat{M}(\omega) = 2\mathrm{Re}\{\alpha W(\omega)H(\omega)C^*(\omega) + W(\omega)H(\omega) - \alpha C^*(\omega)\} - |W(\omega)|^2(N_0 + |H(\omega)|^2) - \alpha|C(\omega)|^2 - 1 \tag{71}$$

and $W(\omega)$, $C(\omega)$ are Fourier series associated to the band shaped Toeplitz matrices $W$, $C$ defined in (35) and (36), respectively. As (70) has the same form as (40) in Theorem 3, replacing $M(\omega)$ by $\hat{M}(\omega)$ in (40), the optimal $G(\omega)$ and asymptotic rate follow directly from Theorem 3.

Fig. 5: Comparison between $a_k$ and $b_k$ for the Proakis-C ISI channel with Method II.c and $\nu = \nu_R = 1$.

Remark 4. Proposition 8 also holds for Method III with $\nu_R = \nu$. This comes from the fact that $[(I+G)(WH - C)]_{\backslash 2\nu} = 0$.

VI. SNR Asymptotics 

In this section, we analyze asymptotic properties of the CS demodulators as $N_0$ goes to 0 and $\infty$. As Method I is inferior to Method II in the GMI sense, we limit our investigations to Method II and Method III. We start the analysis with finite length linear vector channels and with the following limits, which can be verified straightforwardly:

$$\lim_{N_0\to 0} M/N_0 = -(H^HH)^{-1}, \qquad \lim_{N_0\to\infty} N_0(I+M) = H^HH. \tag{72}$$

Moreover, we also have

$$\lim_{N_0\to 0}\tilde{M} = P^2 - P, \qquad \lim_{N_0\to\infty}\tilde{M} = -P. \tag{73}$$

Note that when $P \neq 0$, it must hold by the definition of $\delta_2(G)$ in (25) that $\tilde{M}$ is invertible. As $N_0 \to 0$, $\tilde{M} \to P^2 - P$, which implies that $P \prec I$. Therefore, we restrict the asymptotic SNR analysis to $P \prec I$.
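The limits in (72) and (73) can be checked numerically. The definitions of $M$ and $\tilde{M}$ are given in an earlier part of the paper and are not restated here, so the sketch below assumes the forms $M = -N_0(H^HH + N_0 I)^{-1}$ and $\tilde{M} = P(M+I)P - P$, chosen because they reproduce the stated limits and match the ISI relation (46). The channel, $P$ and the two extreme noise levels are arbitrary example values.

```python
import numpy as np

rng = np.random.default_rng(3)
N, K = 6, 4                             # hypothetical dimensions
H = (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))) / np.sqrt(2)
Gram = H.conj().T @ H
P = np.diag(rng.uniform(0.1, 0.9, K))   # assumed diagonal feedback-quality matrix
I_K = np.eye(K)


def M_of(N0):
    """Assumed closed form of M, consistent with the limits in (72)."""
    return -N0 * np.linalg.inv(Gram + N0 * I_K)


def M_tilde_of(N0):
    """Assumed matrix analogue of (46): P (M + I) P - P."""
    return P @ (M_of(N0) + I_K) @ P - P


lo, hi = 1e-8, 1e8                      # stand-ins for N0 -> 0 and N0 -> infinity
err_72_low = np.linalg.norm(M_of(lo) / lo + np.linalg.inv(Gram))
err_72_high = np.linalg.norm(hi * (I_K + M_of(hi)) - Gram)
err_73_low = np.linalg.norm(M_tilde_of(lo) - (P @ P - P))
err_73_high = np.linalg.norm(M_tilde_of(hi) + P)
```

All four deviations are small for these extreme noise levels, provided $H$ has full column rank so that $(H^HH)^{-1}$ exists.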


Lemma 3. When $N_0 \to 0$ and $N_0 \to \infty$, the optimal $G$ for (23) in Method II satisfies (17), and the following limits hold:

$$\lim_{N_0\to 0}\big[(N_0(I+G_{\mathrm{opt}}))^{-1}\big]_{\nu} = \big[(H^HH)^{-1}\big]_{\nu}, \tag{74}$$

$$\lim_{N_0\to\infty}[N_0G_{\mathrm{opt}}]_{\nu} = [H^HH]_{\nu}. \tag{75}$$

Proof: When $P = 0$, from the earlier theorem on the optimal $G$, the optimal $G$ for (23) satisfies (17). Next we prove that, when $P \neq 0$, as $N_0 \to 0$ and $N_0 \to \infty$, the gradient of $\delta_2(G)$ in (25) converges to zero, therefore (17) also holds. From (72), when $N_0 \to 0$, $M \to 0$ and when $N_0 \to \infty$, $M \to -I$. Therefore, by the definition of $\Omega$,

$$\lim_{N_0\to 0,\infty} d = \Omega\,\mathrm{vec}(MP) = 0.$$

This implies that the gradient of $\delta_2$ in (89) (see Appendix F) converges to zero. Hence the differentials of $I_{\mathrm{GMI}}(V_{\mathrm{opt}}, R_{\mathrm{opt}}, G)$ in (23) when $P \neq 0$ converge to the differentials when $P = 0$, which proves the first part of the lemma.

From (17) and (72), the limit (74) follows and

$$\lim_{N_0\to\infty}\big[N_0(I - (I+G_{\mathrm{opt}})^{-1})\big]_{\nu} = \lim_{N_0\to\infty}[N_0(I+M)]_{\nu} = [H^HH]_{\nu}. \tag{76}$$

Therefore, $I - (I+G_{\mathrm{opt}})^{-1} \to 0$ as $N_0 \to \infty$. By the matrix inversion lemma, $I - (I+G_{\mathrm{opt}})^{-1} \to G_{\mathrm{opt}}$ as $N_0 \to \infty$, and combining this with (76) proves the limit (75).

Footnote: A matrix $A \to B$ or a vector $a \to b$ means that the non-zero elements of $A - B$ or $a - b$ converge to zero.


September 23, 2015, DRAFT


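Lemma 4. In Method II, with the optimal $G$, when $N_0 \to 0$ the GMI increment $\delta_2(G)$ in (25) converges to zero with speed $O(1/N_0)$, and when $N_0 \to \infty$ the GMI increment $\delta_2(G)$ converges to zero with speed $O(N_0)$.

Proof: As $N_0 \to 0$, from (72) we have

$$\lim_{N_0\to 0} d/N_0 = \lim_{N_0\to 0}\Omega\,\mathrm{vec}(MP/N_0) = -\Omega\,\mathrm{vec}\big((H^HH)^{-1}P\big).$$

Based on (73) and Lemma 3, the following equalities hold:

$$\delta_2(G_{\mathrm{opt}}) = -N_0\,\frac{d^H}{N_0}\Big(\Omega\Big(\tilde{M}^*\otimes\frac{(I+G_{\mathrm{opt}})^{-1}}{N_0}\Big)\Omega^T\Big)^{-1}\frac{d}{N_0} = O(N_0).$$

On the other hand, as $N_0 \to \infty$, by the definition of $\Omega$, from (72) we have

$$\lim_{N_0\to\infty} N_0 d = \lim_{N_0\to\infty}\Omega\,\mathrm{vec}(N_0MP) = \lim_{N_0\to\infty}\Omega\,\mathrm{vec}(N_0(I+M)P) = \Omega\,\mathrm{vec}(H^HHP).$$

Based on (73) and Lemma 3, the following equalities hold:

$$\delta_2(G_{\mathrm{opt}}) = -\frac{1}{N_0^2}(N_0d^H)\Big(\Omega\big(\tilde{M}^*\otimes(I+G_{\mathrm{opt}})^{-1}\big)\Omega^T\Big)^{-1}(N_0d) = O(1/N_0^2).$$

Therefore, Lemma 4 holds. ■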


Lemma 5. When $N_0 \to 0$ and $N_0 \to \infty$, the optimal GMI in Method III is independent of $P$ and converges to the optimal GMI for $P = 0$. Moreover, (74) and (75) hold.

The proof is given in Appendix K. Combining Lemmas 3-5 and noticing the fact that Method III and Method II are equivalent when $P = 0$, we have the following Theorem 4.

Theorem 4. Assume that $P \prec I$. When $N_0 \to 0$ and $N_0 \to \infty$, the optimal GMI in Method III converges to the optimal GMI in Method III with $P = 0$. Moreover, the optimal GMI in Method II also converges to the optimal GMI in Method III with $P = 0$, with speed $O(1/N_0)$ when the SNR increases and $O(N_0)$ when the SNR decreases. The optimal $G$ for both methods has the following asymptotic properties:

$$\lim_{N_0\to 0}\big[(N_0(I+G_{\mathrm{opt}}))^{-1}\big]_{\nu} = \big[(H^HH)^{-1}\big]_{\nu},$$

$$\lim_{N_0\to\infty}[N_0G_{\mathrm{opt}}]_{\nu} = [H^HH]_{\nu}.$$

Footnote: Two scalars $A$ and $B$, as functions of a variable $n$, converging to each other with speed $O(n)$ means that there exists a constant $C$ such that $\lim_{n\to\infty} n|A - B| < C$.

From Theorem 4 we know that, except for the case where one of the elements in the diagonal matrix $P$ is 1, the soft feedback information becomes asymptotically insignificant for the design of the CS parameters. The reason is that, when $N_0 \to \infty$, $\hat{x}$ is overwhelmed by the noise, while when $N_0 \to 0$, the optimal front-end filter will null out $\hat{x}$ since the filter can perfectly reconstruct the transmitted symbols without using the side information.

Remark 5. When $N_0 \to 0$, the optimal CS demodulator is the EZF demodulator defined in an earlier example, and when $N_0 \to \infty$, the optimal CS demodulator becomes the TMF, also defined in an earlier example.


Next we extend Theorem 4 to ISI channels. With ISI channels, as the same constraint $P = \alpha I \prec I$ shall hold, we make the restriction that $0 \le \alpha < 1$. Taking the trace on both sides of the equations in (72) and (73), and applying Szegő's eigenvalue distribution theorem, we obtain the following limits:

$$\begin{aligned}
\lim_{N_0\to 0}\frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{M(\omega)}{N_0}\,d\omega &= -\frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{1}{|H(\omega)|^2}\,d\omega, \\
\lim_{N_0\to\infty}\frac{1}{2\pi}\int_{-\pi}^{\pi} N_0(1+M(\omega))\,d\omega &= \frac{1}{2\pi}\int_{-\pi}^{\pi}|H(\omega)|^2\,d\omega, \\
\lim_{N_0\to 0}\frac{1}{2\pi}\int_{-\pi}^{\pi}\tilde{M}(\omega)\,d\omega &= \alpha(\alpha-1), \\
\lim_{N_0\to\infty}\frac{1}{2\pi}\int_{-\pi}^{\pi}\tilde{M}(\omega)\,d\omega &= -\alpha.
\end{aligned} \tag{77}$$

With the above limits in (77), the SNR asymptotic properties for ISI channels are presented in Corollary 1, which is an asymptotic version of Theorem 4 when the channel matrix $H$ and CS parameters are band shaped Toeplitz matrices with infinite dimensions. The detailed proof follows the same analysis as for the finite linear vector channels and is omitted.

Corollary 1. Assume that $0 \le \alpha < 1$. When $N_0 \to 0$ and $N_0 \to \infty$, the optimal GMI in Method III converges to the optimal GMI in Method III with $\alpha = 0$. Moreover, the optimal GMI in Method II also converges to the optimal GMI in Method III with $\alpha = 0$, with speed $O(1/N_0)$ when the SNR increases and $O(N_0)$ when the SNR decreases. The optimal $G$ for both methods has the following asymptotic properties for $|k| \le \nu$:

$$\lim_{N_0\to 0}\int_{-\pi}^{\pi}\frac{\exp(-jk\omega)}{N_0(1+G_{\mathrm{opt}}(\omega))}\,d\omega = \int_{-\pi}^{\pi}\frac{\exp(-jk\omega)}{|H(\omega)|^2}\,d\omega,$$

$$\lim_{N_0\to\infty}\int_{-\pi}^{\pi} N_0G_{\mathrm{opt}}(\omega)\exp(-jk\omega)\,d\omega = \int_{-\pi}^{\pi}|H(\omega)|^2\exp(-jk\omega)\,d\omega.$$
VII. Numerical Results 


A. GMI Simulation for MIMO and ISI Channels

We first evaluate the GMIs for all CS demodulators with various feedback quality and with memory constraint $\nu = 1$. In the $5 \times 5$ MIMO case, all channel elements are assumed to be independent identically distributed (IID) complex Gaussian and the received signal power at each receive antenna is normalized to unity. For the ISI case, we consider IID complex Gaussian
channels with tap length L = 5 and the total average power is normalized to unity. We simulated 
10^ channel realizations for each signal to noise ratio (SNR). The GMIs are compared with the 
GMI of the static CS demodulator from the literature (denoted as "StaticCS" in the figures), which is equivalent to our case when $P = 0$. The channel capacity is also presented for reference.

The simulation results are plotted in Figure 6 and Figure 7. When the quality of the soft information improves beyond $P = 0$, Method II.b performs the best among all CS demodulators since it has the most DoFs. Method II.c is the worst among the Method I and Method II CS demodulators. Method I is slightly worse than Method II.a: although the interference cancelation matrix $R$ is type (a) in both cases, $R$ in Method II.a is more general than in Method I, since in Method I $R$ is constrained to $R = F^HT$. The GMIs of Method III are inferior to those of Method II, which is expected. However, in the ISI case, Method III slightly outperforms Method II.c, because $R$ in Method III has more DoFs than in Method II.c. The simulation results show consistent GMI increments for all CS demodulators when the feedback quality improves.
When P increases from P = 0 to the ideal case P = /, the channel capacity becomes inferior 
to the GMIs as the pair {y, x) becomes superior to {y, x) for information transfer. 

We also evaluate the SNR asymptotic properties described in Theorem 4 with IID complex Gaussian $5 \times 5$ MIMO channels. As shown in Figure 8, the GMIs of Method II.c and Method III both converge to Method III with $P = 0$. Moreover, the GMIs of the CS demodulators converge to EZF at high SNR and TMF at low SNR, which is well aligned with Theorem 4.

Footnote: The channel capacity is calculated without any channel state information (CSI) at the TX side; therefore it does not contain any water-filling.

Fig. 6: GMI simulation results for $5 \times 5$ MIMO IID complex Gaussian channels with $\nu = 1$.

Furthermore, in order to investigate the impact of permutation of the channel matrix, we simulate the GMI with permutations for $5 \times 5$ MIMO with Method II.c and $\nu = 1$. The test setup is the same as in the MIMO GMI simulation above. In order to obtain the optimal permutation, we exhaust all $5! = 120$ possible permutations of the channel matrix for each channel realization, then we average the highest GMIs over all realizations. We also investigate an energy based method to select the permutation. The energy based permutation chooses, among 60 possible permutations (due to the Hermitian property of the Gram matrix $H^HH$, the search space is reduced), the one that maximizes the total energy of the $2\nu+1$ diagonals in the Gram matrix where $G$ can be non-zero. In Figure 9 it can be seen that the energy based permutation of the channels reduces the gap of the GMIs between the non-permuted channels and the channels with optimal permutation.
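The energy based selection described above can be sketched as an exhaustive search. This is a hedged illustration: it assumes that the permutation reorders the columns of $H$ and that "energy" means the sum of squared magnitudes of the Gram matrix entries inside the band. The function names and the random test channel are hypothetical.

```python
import itertools
import numpy as np


def band_energy(H, perm, nu):
    """Energy of the permuted Gram matrix inside the center 2*nu+1 diagonals."""
    Hp = H[:, list(perm)]                 # assumed: permutation acts on the columns of H
    gram = Hp.conj().T @ Hp
    i, j = np.indices(gram.shape)
    return float(np.sum(np.abs(gram[np.abs(i - j) <= nu]) ** 2))


def energy_based_permutation(H, nu):
    """Return the permutation with the largest band energy and the search-space size."""
    K = H.shape[1]
    # A permutation and its reversal give the same band energy, so keep one of each pair.
    candidates = [p for p in itertools.permutations(range(K)) if p[0] < p[-1]]
    best = max(candidates, key=lambda p: band_energy(H, p, nu))
    return best, len(candidates)


rng = np.random.default_rng(4)
H = (rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))) / np.sqrt(2)
best_perm, n_candidates = energy_based_permutation(H, nu=1)
```

For a $5 \times 5$ channel the reduced search space contains $120/2 = 60$ candidates, matching the count in the text. The GMI-optimal permutation would instead require evaluating the GMI for each candidate.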

We next turn to link-level simulations with a Turbo code [45] where the decoder uses 8 internal iterations. A single code block over all transmit symbols is used. The optimal scaling factor of the input and output extrinsic information is found by experiments to be 0.6 for the MAP demodulator. We use the same scaling factor for the CS demodulators. At each SNR point coded blocks are simulated, while three global iterations are used between the demodulator and the decoder.

Fig. 7: GMI simulation results for IID complex Gaussian ISI channels with $\nu = 1$.


B. ISI Channels with Turbo Code 

As an ISI channel is a more natural setting for the CS demodulators, we start with simulations for ISI channels. The block error ratio (BLER) is used as the metric for evaluating performance. The transmitted symbols are QPSK and 16QAM symbols. We choose $N_T = N_R = 8L$ for $T(\omega)$ and $R(\omega)$ in Method I and Method II, respectively, where $L$ is the channel length. The number of front-end filter taps in $W(\omega)$ and $V(\omega)$ is also set to $8L$.

In Figure 10 we plot the performance for the Proakis-B channel [43] with QPSK symbols and a (1064, 1600) turbo code. The three taps are $h = [\, 0.407 \;\; 0.815 \;\; 0.407 \,]$. As can be seen, the gap between the LMMSE-PIC and MAP demodulators is around 3-4 dB for all three iterations. With the proposed CS demodulators, the gap is reduced to less than 0.5 dB. Moreover, Method II.c performs the best while Method I is quite close to it. Method II.b and Method III are inferior. Although this is not consistent with the GMI simulation results, where Method II.b has the highest GMI, it can be explained as follows. The GMI only measures the achievable rate under the ideal condition that optimal detection and decoding are used, which is impractical for most systems. Further, the GMI is evaluated under the assumption that the pair $(x, \hat{x})$ is jointly Gaussian, which is not the case in practice. Therefore the GMI cannot fully reflect the final performance. However, within the same receiver structure (Method I, Method II.b and Method II.c have different receiver structures since the CS parameters are not the same), when the feedback quality increases, the GMIs also increase for each individual method. Nevertheless, the BLER results in Figure 10 reveal two interesting facts: firstly, through iterations with optimized CS parameters, the BLER performance improves dramatically, and secondly, in Method II the interference cancelation matrix $R$ shall be zero inside the band where $G$ has non-zero elements.

Next, in Figure 11 we plot the performance for the Proakis-B channel with 16QAM symbols and a (1064, 1920) turbo code. The conclusions are almost the same as with QPSK symbols, except that Method I performs slightly better than Method II.c with two and three iterations.

Fig. 8: GMI SNR asymptotic results of $5 \times 5$ MIMO IID complex Gaussian channels with $\nu = 1$.

Fig. 9: Optimal permutation results for $5 \times 5$ MIMO IID complex Gaussian channels with Method II.c and $\nu = 1$.


In Figure 12 we plot the performance for the EPR4 [46] channel with QPSK symbols and a
(1064, 1600) turbo code. The four taps are $h = [\,0.5 \;\; 0.5 \;\; {-0.5} \;\; {-0.5}\,]$. In this case, the
LMMSE-PIC demodulator is quite efficient, as with three iterations the performance gap compared with
the MAP demodulator is around 1.5 dB at the reference BLER level. We tested the CS demodulators both
for $\nu = 1$ and $\nu = 2$. With $\nu = 1$, the performance gain of the CS demodulators over LMMSE-PIC
is around 1 dB at the first iteration and 0.4 dB after three iterations at the reference BLER level. With
$\nu = 2$, the performance gain of the CS demodulators over LMMSE-PIC is more than 1 dB after three
iterations and the gap compared with the MAP is less than 0.5 dB. For all CS demodulators the
performance is quite close to each other, but Method II.c is slightly better than the others.

In Figure 13 we plot the performance for the EPR4 channel with 16QAM symbols and a
(1064, 1920) turbo code. The performance of all CS demodulators is quite close to each other,
and they outperform LMMSE-PIC by more than 1 dB. Moreover, Method I performs slightly better
than the others after three iterations at high SNR.

In Figure 14 we plot the performance for the Proakis-C channel with QPSK symbols and a
(1064, 1620) turbo code. The five taps are $h = [\,0.227 \;\; 0.46 \;\; 0.688 \;\; 0.46 \;\; 0.227\,]$. We test the CS
demodulators both for $\nu = 1$ and $\nu = 2$. The conclusions are similar to those for the other channels
that have been tested with QPSK symbols. Method II.c performs better than Method I and Method
II.b, while Method III is slightly inferior. With $\nu = 1$, the CS demodulators outperform LMMSE-PIC
by more than 2 dB in all three iterations. With $\nu = 2$, the gap compared with MAP is
reduced to less than 1 dB, while LMMSE-PIC has a gap to MAP of up to 10 dB.

Fig. 10: Performance evaluation with Proakis-B channel with QPSK Symbols. (Plot title: Proakis-B
ISI Channel with Turbo code (1064,1600), QPSK. Curves: LMMSE-PIC, Method I, Method II.b,
Method II.c and Method III with $\nu = 1$, and MAP, each with 1, 2 and 3 iterations.)


C. MIMO Channels with Turbo Code 

Next we evaluate the BLER performance for MIMO channels. An interesting case is
when the number of receive antennas is less than the number of transmit antennas, i.e., $N < K$.
In this case the LMMSE-PIC demodulator will fail [47] at the first iteration due to the lack of
receive diversity. We tested 4x6 MIMO IID complex Gaussian channels with QPSK symbols
and a (1064, 1800) turbo code. In Figure 15 we plot the performance comparison with 1, 2
and 3 global iterations. As can be seen, Method II.c performs better than Method I, Method
II.a and Method II.b both for $\nu = 1$ (right figure) and $\nu = 3$ (left figure). In the first iteration,
the performance curves of all methods almost overlap and cannot be easily
distinguished from the plot.

Fig. 11: Performance evaluation with Proakis-B channel with 16QAM Symbols. (Horizontal axis: SNR [dB].)

The performance of the LMMSE-PIC and MAP demodulators is compared with Method II.c
and Method III in Figure 16. LMMSE-PIC is inferior, especially at the first iteration where there
is no soft information available. Method II.c with $\nu = 1$ improves the performance by more than
1 dB over LMMSE-PIC with the same number of iterations. With $\nu = 3$, Method II.c is quite
close to MAP, with less than 1 dB gap at the reference BLER level. Again, in both cases Method III
performs quite close to Method II.c.

The impact of permuting the channel matrix is also simulated for Method II.c with $\nu = 1$. It
can be seen in Figure 17 that the energy based permutation outperforms the case with
no permutation, which is aligned with the GMI simulations that are presented in Figure 9.

Finally we remark that, for the sake of complexity savings, both for finite linear vector channels
and ISI channels, the parameters of the CS demodulators do not need to be updated through all
iterations. Once the feedback information quality is good enough and the parameter $P$ or $\alpha$ is
close to ideal, the optimal CS parameters can be kept unchanged in the remaining iterations.


Fig. 12: Performance evaluation with EPR4 channel with QPSK Symbols. (Plot title: EPR4 ISI
Channel with Turbo code (1064,1600), QPSK.)


VIII. Summary 

In this paper we considered the design of CS demodulators for linear channels that use a trellis
representation of the received signal in combination with interference cancelation of the signal
part that is not appropriately modeled by the trellis. In order to reach a trellis representation, a
linear filter is applied as front-end. This is an extension of the well studied CS demodulators to
iterative receivers. We analyzed the properties of three different approaches for designing such
optimal CS demodulators. In the framework used, there are three parameters that need to be
optimized. Based on a generalized mutual information cost function, two of these are solved for
in closed form, while the third needs to be numerically optimized, except for the last method
where we constructed it explicitly at the cost of a small performance loss. A simple gradient
based optimization is used and turns out to perform well. Numerical results are provided to
illustrate the behavior of the proposed CS demodulators. In general, Method II.c, which is based
on the Ungerboeck model, outperforms Method I, which is based on the Forney model. However,
Method I performs slightly better than Method II in some cases with 16QAM symbols. Method
II has the advantage over Method I that the optimization problem is concave. Furthermore, the
suboptimal Method III performs close to Method I and Method II while it has all parameters in
closed form. An interesting result is that the interference cancelation matrix should not cancel the
effective channel perfectly outside the memory length. We also analyzed asymptotic properties
and showed that Method III converges to Method II asymptotically when the noise density goes
to zero or infinity.

Fig. 13: Performance evaluation with EPR4 channel with 16QAM Symbols. (Plot title: EPR4 ISI
Channel with Turbo code (1064,1920), 16QAM.)


Fig. 14: Performance evaluation with Proakis-C channel with QPSK Symbols. (Plot title: Proakis-C
ISI Channel with Turbo code (1064,1600), QPSK; horizontal axis: SNR [dB].)

Fig. 15: Performance evaluation between Method I, Method II.a, Method II.b and Method II.c.
(Both panels are titled 4x6 MIMO with Turbo Code (1064,1800).)


Appendix A: Derivation of the GMI

Take the eigenvalue decomposition $Q \Lambda Q^H = G$ and let $s = Q^H x$. As $x$ is assumed
to be a zero-mean complex Gaussian random vector with covariance matrix $I$, we can write
$p(y|x,\hat{x})$ as

$$p(y|x,\hat{x}) = \exp\left(2\,\mathrm{Re}\{s^H d\} - s^H \Lambda s\right), \tag{78}$$

where $d = Q^H (Vy - R\hat{x})$. We can now evaluate

$$\begin{aligned}
p(y|\hat{x}) &= \int p(y|x,\hat{x})\, p(x)\, dx \\
&= \frac{1}{\pi^K}\int \exp\left(2\,\mathrm{Re}\{s^H d\} - s^H \Lambda s\right) \exp\left(-s^H s\right) ds \\
&= \prod_{k=1}^{K} \frac{1}{1+\lambda_k} \exp\left(\frac{|d_k|^2}{1+\lambda_k}\right),
\end{aligned}$$

where $\lambda_k$ is the $k$th diagonal element of $\Lambda$ and $d_k$ is the $k$th entry of $d$. Taking the
average over $y$ and $\hat{x}$ gives

$$-E[\log p(y|\hat{x})] = \log\det(I+G) - \mathrm{Tr}\left(L(I+G)^{-1}\right),$$

where the matrix $L = E[Q d d^H Q^H]$ is given by

$$L = V(N_0 I + HH^H)V^H - VHPR^H - RPH^HV^H + RPR^H.$$

On the other hand, we have

$$-E[\log p(y|x,\hat{x})] = \mathrm{Tr}(G) - 2\,\mathrm{Re}\{\mathrm{Tr}(VH - RP)\}.$$

Combining the two expectations, the GMI reads

$$\begin{aligned}
I_{\mathrm{GMI}}(V,R,G) &= \log\det(I+G) - \mathrm{Tr}\left(L(I+G)^{-1}\right) - \mathrm{Tr}(G) + 2\,\mathrm{Re}\{\mathrm{Tr}(VH - RP)\} \\
&= \log\det(I+G) - \mathrm{Tr}(G) + 2\,\mathrm{Re}\{\mathrm{Tr}(VH - RP)\} \\
&\quad - \mathrm{Tr}\left((I+G)^{-1}\left(V[HH^H + N_0 I]V^H - 2\,\mathrm{Re}\{VHPR^H\} + RPR^H\right)\right).
\end{aligned}$$
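As a sanity check of the closed form above, the following minimal sketch evaluates the GMI numerically. The function name `gmi_ungerboeck` and the test values are illustrative assumptions, not part of the paper. The check uses the unconstrained choice $G = H^H H/N_0$, $V = H^H/N_0$, $R = 0$, for which the expression should collapse to $\log\det(I + H^H H/N_0)$ (in nats).

```python
import numpy as np

def gmi_ungerboeck(H, N0, V, R, G, P):
    """Evaluate the closed-form GMI of Appendix A (in nats).

    H: N x K channel, V: K x N front-end, R: K x K feedback filter,
    G: K x K Hermitian target response, P: K x K feedback quality matrix.
    """
    N, K = H.shape
    I_K = np.eye(K)
    # L = V(N0 I + H H^H)V^H - V H P R^H - R P H^H V^H + R P R^H
    L = (V @ (N0 * np.eye(N) + H @ H.conj().T) @ V.conj().T
         - V @ H @ P @ R.conj().T
         - R @ P @ H.conj().T @ V.conj().T
         + R @ P @ R.conj().T)
    logdet = np.linalg.slogdet(I_K + G)[1]
    value = (logdet - np.trace(G)
             + 2 * np.real(np.trace(V @ H - R @ P))
             - np.trace(np.linalg.solve(I_K + G, L)))
    return float(np.real(value))

# Hypothetical test channel (not taken from the paper).
rng = np.random.default_rng(0)
N, K, N0 = 4, 3, 0.5
H = (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))) / np.sqrt(2)

# Unconstrained parameters: no trellis reduction, no interference cancelation.
G_full = H.conj().T @ H / N0
V_full = H.conj().T / N0
R_zero = np.zeros((K, K))
P_zero = np.zeros((K, K))

gmi_full = gmi_ungerboeck(H, N0, V_full, R_zero, G_full, P_zero)
capacity = np.linalg.slogdet(np.eye(K) + G_full)[1]
```

With these parameters `gmi_full` and `capacity` agree up to rounding, and scaling `V_full` away from its optimum lowers the GMI, as expected from a quadratic dependence on $V$.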


Appendix B: The Proof of Proposition 1

As the GMI of Method I is quadratic in $W$ and no constraints apply to $W$, taking the
gradient of $I_{\mathrm{GMI}}(W,T,F)$ with respect to $W$ and setting it to zero shows that the optimal $W$ is given
in (11). Inserting $W_{\mathrm{opt}}$ into the GMI gives, after some manipulations,

$$\begin{aligned}
I_{\mathrm{GMI}}(W_{\mathrm{opt}},T,F) &= K + \log\det(I+F^HF) + \mathrm{Tr}\left(T^H F (I+F^HF)^{-1} F^H T M\right) \\
&\quad + \mathrm{Tr}\left(\tilde{M}(I+F^HF)\right) + 2\,\mathrm{Re}\{\mathrm{Tr}(P \tilde{M} F^H T)\},
\end{aligned} \tag{79}$$

where $M$ and $\tilde{M}$ are defined in (9) and (10).

If $P = 0$, (79) equals

$$I_1(F) = K + \log\det(I+F^HF) + \mathrm{Tr}\left(\tilde{M}(I+F^HF)\right).$$

In this case, there is no soft information available and the matrix $T$ is not included in the
formula. When $P \neq 0$, the terms of $I_{\mathrm{GMI}}$ in (79) related to $T$ are

$$f(T) = \mathrm{Tr}\left(T^H F (I+F^HF)^{-1} F^H T M\right) + 2\,\mathrm{Re}\{\mathrm{Tr}(P \tilde{M} F^H T)\}.$$

Let $t_k$ denote the $k$th column of $T$ with all elements in rows $[k, \min(k+\nu, K-1)]$ removed, and
define the column vector $t = [t_0^T \; t_1^T \; \ldots \; t_{K-1}^T]^T$. Then, by the definition of the indication matrix
$\Omega$, we have

$$t = \Omega\, \mathrm{vec}(T).$$

Similarly, let $z_k$ denote the $k$th column of the matrix $F\tilde{M}P$ with all elements in rows
$[k, \min(k+\nu, K-1)]$ removed, and define the column vector $z = [z_0^T \; z_1^T \; \ldots \; z_{K-1}^T]^T$. Then we
have

$$z = \Omega\, \mathrm{vec}(F\tilde{M}P) = \Omega\left((P\tilde{M}^*) \otimes I\right) \mathrm{vec}(F).$$

Finally, define a Hermitian matrix $B_1$ as

$$B_1 = \Omega\left(M^* \otimes \left(F(I+F^HF)^{-1}F^H\right)\right)\Omega^T.$$

With that, we can rewrite $f(T)$ as

$$f(T) = t^H B_1 t + 2\,\mathrm{Re}\{z^H t\}.$$

Taking the gradient with respect to $t$ and setting it to zero yields

$$t_{\mathrm{opt}} = -B_1^{-1} z. \tag{80}$$

Transferring $t_{\mathrm{opt}}$ back into $T_{\mathrm{opt}}$ gives the optimal $T$ in (12), and inserting this into $f(T)$ gives

$$f(T_{\mathrm{opt}}) = -z^H B_1^{-1} z.$$

Thus, with the optimal $W$ and $T$, when $P \neq 0$ the GMI equals

$$\begin{aligned}
I_{\mathrm{GMI}}(W_{\mathrm{opt}},T_{\mathrm{opt}},F) &= K + \log\det(I+F^HF) + \mathrm{Tr}\left(\tilde{M}(I+F^HF)\right) \\
&\quad - \mathrm{vec}(F)^H D^H \left(\Omega\left(M^* \otimes \left(F(I+F^HF)^{-1}F^H\right)\right)\Omega^T\right)^{-1} D\, \mathrm{vec}(F),
\end{aligned}$$

where $D = \Omega\left((P\tilde{M}^*) \otimes I\right)$.

Fig. 16: Performance evaluation between the LMMSE-PIC, Method II.c and the MAP. (Plot title:
4x6 MIMO Channel with Turbo code (1064,1800), QPSK; horizontal axis: SNR (dB).)

Fig. 17: Performance evaluation with channel permutations of Method II.c. (Plot title: 4x6 MIMO
Channel and Method II.c with Permutation.)


Appendix C: Non-Concavity Examples of Method I

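We give examples to demonstrate the non-concavity of Method I for MIMO and ISI channels,
assuming that $P = I$ and $\alpha = 1$, respectively. The memory length is $\nu = 1$ and the noise
density $N_0$ equals 1 in both cases. A 5x5 MIMO channel and the Proakis-C channel are used.

Example 4. MIMO:

$$H = \begin{bmatrix}
2 & 0 & -3 & 5 & 4 \\
-5 & 2 & -1 & 0 & 2 \\
2 & -4 & 3 & 3 & 3 \\
-1 & -5 & -4 & 1 & 2 \\
0 & -2 & 0 & 5 & 5
\end{bmatrix},\quad
F_1 = \begin{bmatrix}
4.94 & 4.45 & 0 & 0 & 0 \\
0 & 0.21 & 3.85 & 0 & 0 \\
0 & 0 & 5.56 & 1.76 & 0 \\
0 & 0 & 0 & 0.61 & 7.10 \\
0 & 0 & 0 & 0 & 2.79
\end{bmatrix},\quad
F_2 = \begin{bmatrix}
2.03 & 6.17 & 0 & 0 & 0 \\
0 & 5.22 & 3.56 & 0 & 0 \\
0 & 0 & 7.43 & 0.73 & 0 \\
0 & 0 & 0 & 4.98 & 4.32 \\
0 & 0 & 0 & 0 & 10.11
\end{bmatrix}.$$

Example 5. ISI:

$$h = [\,0.227 \;\; 0.460 \;\; 0.688 \;\; 0.460 \;\; 0.227\,],\quad
f_1 = [\,0.1606 \;\; 0.9009\,],\quad
f_2 = [\,0.2230 \;\; 0.2035\,].$$

The $I_{\mathrm{GMI}}(W_{\mathrm{opt}}, T_{\mathrm{opt}}, F)$ given in (13) as a function of $F$ is plotted on the left in Figure 18,
while the $I(W_{\mathrm{opt}}(\omega), T_{\mathrm{opt}}(\omega), F(\omega))$ given in (51) as a function of $F(\omega)$ is plotted on the right.
If $I_{\mathrm{GMI}}(W_{\mathrm{opt}}, T_{\mathrm{opt}}, F)$ and $I(W_{\mathrm{opt}}(\omega), T_{\mathrm{opt}}(\omega), F(\omega))$ were concave or convex, the blue curves
would lie above or below the black curves, respectively, which clearly does not hold in our examples.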


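Fig. 18: Examples for the non-concaveness of Method I for 5x5 MIMO channel and
Proakis-C ISI channel. (Panels: 5x5 MIMO; Proakis-C ISI Channel. Horizontal axis: $t$.)


Appendix D: Derivation of the Gradient in Method I with Finite Linear Vector Channel

In this section we derive the first order differential of the GMI given in (13) with respect to $F$.
In order to utilize the differential with respect to a matrix, we use the $\alpha$-differential as defined
in [48]. Assume a matrix $Y_{N,K}$ with dimension $N \times K$ and a matrix $X_{M,S}$ with dimension
$M \times S$, and define $\partial_X Y$ as the $\alpha$-differential of $Y$ with respect to $X$. Furthermore, define $y_i$ and $x_i$
through $[\,y_1 \; y_2 \; \cdots \; y_{NK}\,] = \mathrm{vec}(Y)^T$ and $[\,x_1 \; x_2 \; \cdots \; x_{MS}\,] = \mathrm{vec}(X)^T$. The $\alpha$-differential $\partial_X Y$
is defined as

$$\partial_X Y = \frac{\partial\, \mathrm{vec}(Y)}{\partial\, \mathrm{vec}(X)^T} =
\begin{bmatrix}
\frac{\partial y_1}{\partial x_1} & \frac{\partial y_1}{\partial x_2} & \cdots & \frac{\partial y_1}{\partial x_{MS}} \\
\frac{\partial y_2}{\partial x_1} & \frac{\partial y_2}{\partial x_2} & \cdots & \frac{\partial y_2}{\partial x_{MS}} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial y_{NK}}{\partial x_1} & \frac{\partial y_{NK}}{\partial x_2} & \cdots & \frac{\partial y_{NK}}{\partial x_{MS}}
\end{bmatrix}.$$

The reason for adopting the $\alpha$-differential is that it preserves the chain rule and the product
rule. First we introduce an $NK \times NK$ permutation matrix $Z_{N,K}$, which satisfies the condition

$$\mathrm{vec}(Y^T) = Z_{N,K}\, \mathrm{vec}(Y).$$

It is easy to verify that $Z_{N,K}^T = Z_{K,N}$, and when $N = 1$ or $K = 1$, $Y$ is a vector and $\mathrm{vec}(Y^T) =
\mathrm{vec}(Y)$, hence $Z_{N,1} = I_N$ and $Z_{1,K} = I_K$. Furthermore, by definition we have

$$\partial_F(F) = \partial_F(\mathrm{vec}(F)) = I, \qquad
\partial_F(F^*) = \partial_F(\mathrm{vec}(F^*)) = 0.$$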
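The permutation matrix $Z_{N,K}$ can be built explicitly. The sketch below is an illustration only: the helper names `vec` and `commutation_matrix` are assumptions, and $\mathrm{vec}(\cdot)$ is taken to stack columns (column-major order), which is the usual convention for the identities used in this appendix.

```python
import numpy as np

def vec(Y):
    """Stack the columns of Y into one vector (column-major order)."""
    return Y.flatten(order="F")

def commutation_matrix(N, K):
    """Return the NK x NK permutation Z with vec(Y.T) = Z @ vec(Y) for an N x K matrix Y."""
    Z = np.zeros((N * K, N * K), dtype=int)
    for i in range(N):
        for j in range(K):
            # Y[i, j] sits at index i + j*N in vec(Y) and at index j + i*K in vec(Y.T).
            Z[j + i * K, i + j * N] = 1
    return Z

N, K = 3, 4
Y = np.arange(N * K).reshape(N, K)
Z = commutation_matrix(N, K)
```

Here `Z @ vec(Y)` reproduces `vec(Y.T)`, the transpose of `Z` equals `commutation_matrix(K, N)`, and the matrix reduces to an identity when one of the dimensions is 1.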


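We start by reviewing a few properties [48], [49] of the $\alpha$-differential that will be used later. Assume
that both matrices $X$ and $Y$ are functions of $F$ and that their dimensions are specified by the subscripts
attached to them. Then the equations below hold:

$$\begin{aligned}
\partial_F\left(X_{K,K}^{-1}\right) &= -\left(X_{K,K}^{-T} \otimes X_{K,K}^{-1}\right) \partial_F X_{K,K}, \\
\partial_F\left(Y_{N,K} X_{K,S}\right) &= \left(X_{K,S}^T \otimes I_N\right) \partial_F Y_{N,K} + \left(I_S \otimes Y_{N,K}\right) \partial_F X_{K,S}, \\
\partial_F\left(\log\det(X_{K,K})\right) &= \mathrm{vec}\left(X_{K,K}^{-T}\right)^T \partial_F X_{K,K}, \\
\partial_F\left(X_{N,K} \otimes Y_{M,S}\right) &= \left(I_K \otimes Z_{S,N} \otimes I_M\right)\left(I_{NK} \otimes \mathrm{vec}(Y_{M,S})\right) \partial_F X_{N,K} \\
&\quad + \left(I_K \otimes Z_{S,N} \otimes I_M\right)\left(\mathrm{vec}(X_{N,K}) \otimes I_{MS}\right) \partial_F Y_{M,S}.
\end{aligned}$$

The $\alpha$-differential of $I_1(F)$ with respect to $F$ is

$$\begin{aligned}
\partial_F(I_1) &= \mathrm{vec}\left((I+F^HF)^{-T}\right)^T \left(I_K \otimes F^H\right) + \mathrm{vec}\left(F^* \tilde{M}^T\right)^T \\
&= \mathrm{vec}\left(F\tilde{M} + F(I+F^HF)^{-1}\right)^H.
\end{aligned} \tag{81}$$

Define a $K \times K$ matrix $B = F(I+F^HF)^{-1}F^H$ and an $S \times S$ matrix $U = \left(\Omega(M^* \otimes B)\Omega^T\right)^{-1}$.
The $\alpha$-differential of $\delta_1(F)$ with respect to $F$ is

$$\partial_F(\delta_1) = -\mathrm{vec}(F)^H D^H \left(\left(\mathrm{vec}(F)^T D^T\right) \otimes I_S\right) \partial_F(U) - \mathrm{vec}(F)^H D^H U D, \tag{82}$$

where

$$\begin{aligned}
\partial_F(U) &= -\left(U^T \otimes U\right) \partial_F\left(\Omega(M^* \otimes B)\Omega^T\right) \\
&= -\left(U^T \otimes U\right)\left(\Omega \otimes \Omega\right) \partial_F\left(M^* \otimes B\right) \\
&= -\left(U^T \otimes U\right)\left(\Omega \otimes \Omega\right)\left(I_K \otimes Z_{K,K} \otimes I_K\right)\left(\mathrm{vec}(M^*) \otimes I_{K^2}\right) \partial_F B
\end{aligned} \tag{83}$$

and

$$\begin{aligned}
\partial_F(B) &= \partial_F\left(I - (I+FF^H)^{-1}\right) \\
&= \left((I+FF^H)^{-T} \otimes (I+FF^H)^{-1}\right) \partial_F\left(I+FF^H\right) \\
&= \left((I+FF^H)^{-T} \otimes (I+FF^H)^{-1}\right)\left(F^* \otimes I_K\right) \\
&= \left(F^*(I+F^HF)^{-T}\right) \otimes (I-B).
\end{aligned} \tag{84}$$

Define a $K \times K$ matrix $\tilde{F} = (I+F^HF)^{-1}F^H$ and a $K^4 \times K^2$ matrix

$$\Psi = \left(I_K \otimes Z_{K,K} \otimes I_K\right)\left(\mathrm{vec}(M^*) \otimes I_{K^2}\right). \tag{85}$$

By combining (81)-(85), we finally have, when $P \neq 0$,

$$\begin{aligned}
\partial_F\left(I_{\mathrm{GMI}}(W_{\mathrm{opt}}, T_{\mathrm{opt}}, F)\right) &= \partial_F(I_1) + \partial_F(\delta_1) \\
&= \mathrm{vec}\left(F\tilde{M} + \tilde{F}^H\right)^H - \mathrm{vec}(F)^H D^H U D \\
&\quad + \mathrm{vec}(F)^H D^H \left(\left(\Omega^T U D\, \mathrm{vec}(F)\right)^T \otimes (U\Omega)\right) \Psi \left(\tilde{F}^T \otimes (I-B)\right).
\end{aligned}$$


Appendix E: The Proof of Proposition 3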

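As the GMI of Method II is quadratic in $V$ and no constraints apply to $V$, taking the
gradient of $I_{\mathrm{GMI}}(V, R, G)$ with respect to $V$ and setting it to zero yields the optimal $V$ given
in (21). Inserting $V_{\mathrm{opt}}$ into the GMI gives, after some manipulations,

$$\begin{aligned}
I_{\mathrm{GMI}}(V_{\mathrm{opt}}, R, G) &= K + \log\det(I+G) + 2\,\mathrm{Re}\{\mathrm{Tr}(P\tilde{M}R)\} \\
&\quad + \mathrm{Tr}\left(\tilde{M}(I+G)\right) + \mathrm{Tr}\left((I+G)^{-1} R M R^H\right),
\end{aligned} \tag{86}$$

where $M$ and $\tilde{M}$ are defined in (9) and (10).

If $P = 0$, (86) equals

$$I_2(G) = K + \log\det(I+G) + \mathrm{Tr}\left(\tilde{M}(I+G)\right).$$

When $P \neq 0$, the terms of $I_{\mathrm{GMI}}(V_{\mathrm{opt}}, R, G)$ in (86) related to $R$ are

$$g(R) = 2\,\mathrm{Re}\{\mathrm{Tr}(P\tilde{M}R)\} + \mathrm{Tr}\left((I+G)^{-1} R M R^H\right).$$

Let $r_k$ denote the $k$th column of $R$, where all elements in rows $[\max(0, k-\nu_R), \min(k+\nu_R, K-1)]$
are removed, and define the column vector $r = [r_0^T \; r_1^T \; \ldots \; r_{K-1}^T]^T$. Then we have

$$r = \Omega\, \mathrm{vec}(R).$$

Moreover, let $d_k$ denote the $k$th column of the matrix $\tilde{M}P$ with all elements in rows
$[\max(0, k-\nu_R), \min(k+\nu_R, K-1)]$ removed, and define the vector $d = [d_0^T \; d_1^T \; \ldots \; d_{K-1}^T]^T$.
From the definition of $d$, we have

$$d = \Omega\, \mathrm{vec}(\tilde{M}P).$$

Defining a Hermitian matrix $B_2$ as

$$B_2 = \Omega\left(M^* \otimes (I+G)^{-1}\right)\Omega^T,$$

we can write $g(R)$ as

$$g(R) = r^H B_2 r + 2\,\mathrm{Re}\{d^H r\}.$$

Therefore the optimal $r$ is

$$r_{\mathrm{opt}} = -B_2^{-1} d. \tag{87}$$

Transferring $r_{\mathrm{opt}}$ back into $R_{\mathrm{opt}}$ gives the optimal $R$ in (22), and inserting this into $g(R)$ gives

$$g(R_{\mathrm{opt}}) = -d^H B_2^{-1} d.$$

Thus, with the optimal $V$ and $R$, when $P \neq 0$ the GMI equals

$$\begin{aligned}
I_{\mathrm{GMI}}(V_{\mathrm{opt}}, R_{\mathrm{opt}}, G) &= K + \log\det(I+G) + \mathrm{Tr}\left(\tilde{M}(I+G)\right) \\
&\quad - d^H \left(\Omega\left(M^* \otimes (I+G)^{-1}\right)\Omega^T\right)^{-1} d.
\end{aligned}$$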
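The vectorization step in this proof can be checked numerically. The sketch below is illustrative and rests on assumptions: `off_band_selector` is a hypothetical helper that plays the role of $\Omega$ for a band of half-width $\nu_R$, and the matrices `M`, `S` and `X` are random stand-ins for $M$ (negative definite), $I+G$ (positive definite) and $\tilde{M}P$, not values from the paper.

```python
import numpy as np

def vec(A):
    """Stack the columns of A into one vector (column-major order)."""
    return A.flatten(order="F")

def off_band_selector(K, nu):
    """Selection matrix keeping the entries of vec(R) with |row - col| > nu."""
    keep = [i + j * K for j in range(K) for i in range(K) if abs(i - j) > nu]
    return np.eye(K * K)[keep, :]

rng = np.random.default_rng(1)
K, nu = 4, 1

def random_pd(K):
    """Random Hermitian positive definite matrix."""
    A = rng.standard_normal((K, K)) + 1j * rng.standard_normal((K, K))
    return A @ A.conj().T + np.eye(K)

M = -random_pd(K)   # stand-in for the negative definite matrix M
S = random_pd(K)    # stand-in for I + G
X = rng.standard_normal((K, K)) + 1j * rng.standard_normal((K, K))  # stand-in for M~ P

Omega = off_band_selector(K, nu)
B2 = Omega @ np.kron(M.conj(), np.linalg.inv(S)) @ Omega.T
d = Omega @ vec(X)

def g(R):
    """g(R) = 2 Re Tr(X^H R) + Tr(S^{-1} R M R^H)."""
    quad = np.trace(np.linalg.solve(S, R @ M @ R.conj().T))
    return float(np.real(2 * np.trace(X.conj().T @ R) + quad))

# Stationary point of r^H B2 r + 2 Re{d^H r}, mapped back to a band-constrained matrix.
r_opt = -np.linalg.solve(B2, d)
R_opt = (Omega.T @ r_opt).reshape(K, K, order="F")
g_opt = g(R_opt)
g_closed_form = float(np.real(-d.conj() @ np.linalg.solve(B2, d)))
```

Under these assumptions `g_opt` matches `g_closed_form`, `R_opt` is zero inside the band, and any off-band perturbation of `R_opt` lowers `g`, since `B2` is negative definite.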


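Appendix F: Derivation of the Gradient in Method II with Finite Linear Vector Channel

Now we calculate the $\alpha$-differential of $I_{\mathrm{GMI}}(V_{\mathrm{opt}}, R_{\mathrm{opt}}, G)$ given in (23) with respect to $G$
when $P \neq 0$. Taking the $\alpha$-differential of $I_2(G)$ with respect to $G$ yields

$$\partial_G(I_2) = \mathrm{vec}\left((I+G)^{-1} + \tilde{M}\right)^H. \tag{88}$$

Define an $S \times S$ Hermitian matrix $\Phi = \left(\Omega\left(M^* \otimes (I+G)^{-1}\right)\Omega^T\right)^{-1}$. Taking the $\alpha$-differential
of $\delta_2(G)$ with respect to $G$ yields

$$\begin{aligned}
\partial_G(\delta_2) &= -\left(d^T \otimes d^H\right) \partial_G(\Phi) \\
&= \left((d^T \Phi^T) \otimes (d^H \Phi)\right)\left(\Omega \otimes \Omega\right) \partial_G\left(M^* \otimes (I+G)^{-1}\right) \\
&= \left((d^T \Phi^T \Omega) \otimes (d^H \Phi \Omega)\right) \Psi\, \partial_G\left((I+G)^{-1}\right) \\
&= -\left((d^T \Phi^T \Omega) \otimes (d^H \Phi \Omega)\right) \Psi \left((I+G)^{-T} \otimes (I+G)^{-1}\right),
\end{aligned} \tag{89}$$

where $\Psi$ is defined in (85). Combining (88) and (89), we obtain

$$\begin{aligned}
\partial_G\left(I_{\mathrm{GMI}}(V_{\mathrm{opt}}, R_{\mathrm{opt}}, G)\right) &= \partial_G(I_2) + \partial_G(\delta_2) \\
&= \mathrm{vec}\left((I+G)^{-1} + \tilde{M}\right)^H \\
&\quad - \left((d^T \Phi^T \Omega) \otimes (d^H \Phi \Omega)\right) \Psi \left((I+G)^{-T} \otimes (I+G)^{-1}\right).
\end{aligned}$$


Appendix G: The Concavity Proof of Method II with Finite Linear Vector Channels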


When $P = 0$, as $\log\det(I+G)$ is concave [50] and $\mathrm{Tr}(\tilde{M}(I+G))$ is linear in $G$, the function
$I_2(G)$ in (24) is concave with respect to $G$ whenever $I+G$ is positive definite.

The concavity when $P \neq 0$ can be deduced from the composition theorem in [50, Chapter
3.6]. For a positive definite matrix $X$, $d^H X^{-1} d$ is convex and non-increasing (with respect
to the generalized inequality for positive definite Hermitian matrices, see [50], [51]) for any
column vector $d$. Furthermore, since $I+G$ is positive definite, $(I+G)^{-1}$ is convex. As $M \prec 0$,
$X = \Omega\left(M^* \otimes (I+G)^{-1}\right)\Omega^T$ is concave in $G$.

By the composition theorem, $d^H \left(\Omega\left(M^* \otimes [I+G]^{-1}\right)\Omega^T\right)^{-1} d$ is convex, and $\delta_2(G)$ is then
concave. Therefore the function $I_{\mathrm{GMI}}(V_{\mathrm{opt}}, R_{\mathrm{opt}}, G)$ in (23) is concave with respect to $G$
whenever $I+G$ is positive definite.
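The $P = 0$ case can be illustrated with a chord test along a line segment between two admissible target responses. This is a sketch under assumptions: `M_tilde` is a random Hermitian stand-in for $\tilde{M}$, and `G_a`, `G_b` are random positive semidefinite matrices, none of which come from the paper. The constant $K$ is omitted since it does not affect concavity.

```python
import numpy as np

rng = np.random.default_rng(2)
K = 4

def random_psd(K):
    """Random Hermitian positive semidefinite matrix."""
    A = rng.standard_normal((K, K)) + 1j * rng.standard_normal((K, K))
    return A @ A.conj().T

M_tilde = -(random_psd(K) + np.eye(K))  # hypothetical stand-in for M~

def h(G):
    """log det(I + G) + Tr(M_tilde (I + G)), the G-dependent part of I_2(G)."""
    S = np.eye(K) + G
    return float(np.linalg.slogdet(S)[1] + np.real(np.trace(M_tilde @ S)))

G_a, G_b = random_psd(K), random_psd(K)
ts = np.linspace(0.0, 1.0, 21)

# Function values along the segment and along the chord between its end points.
curve = np.array([h((1 - t) * G_a + t * G_b) for t in ts])
chord = (1 - ts) * h(G_a) + ts * h(G_b)
```

For a concave function the values in `curve` never fall below those in `chord`, which is what this example shows for the $P = 0$ objective.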

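Appendix H: The Proof of Proposition 5

The Fourier series associated to the Toeplitz matrix $W$ is

$$W(\omega) = \sum_{k=-\infty}^{\infty} w_k \exp(jk\omega),$$

and the differential of $I(W(\omega), T(\omega), F(\omega))$ in (44) with respect to $w_k$ (where $\omega$ is fixed) is

$$\begin{aligned}
\frac{\partial I}{\partial w_k} &= -\frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{|F(\omega)|^2 \left(N_0 + |H(\omega)|^2\right) W^*(\omega)}{1 + |F(\omega)|^2} \exp(jk\omega)\, d\omega \\
&\quad + \frac{1}{2\pi}\int_{-\pi}^{\pi} \left(F^*(\omega)H(\omega) + \frac{\alpha |F(\omega)|^2 H(\omega) T^*(\omega)}{1 + |F(\omega)|^2}\right) \exp(jk\omega)\, d\omega.
\end{aligned} \tag{90}$$

Since (90) should equal zero for all $k$, the optimal $W(\omega)$ is given in (49). Putting $W_{\mathrm{opt}}(\omega)$ back
into (44) yields

$$\begin{aligned}
I(W_{\mathrm{opt}}(\omega), T(\omega), F(\omega)) &= 1 + \frac{\alpha}{\pi}\int_{-\pi}^{\pi} \mathrm{Re}\{F^*(\omega) T(\omega) \tilde{M}(\omega)\}\, d\omega \\
&\quad + \frac{1}{2\pi}\int_{-\pi}^{\pi} \left(\log\left(1 + |F(\omega)|^2\right) + \frac{M(\omega)\, |T(\omega)F(\omega)|^2}{1 + |F(\omega)|^2} + \tilde{M}(\omega)\left(1 + |F(\omega)|^2\right)\right) d\omega,
\end{aligned} \tag{91}$$

where $M(\omega)$ and $\tilde{M}(\omega)$ are defined in (45) and (46).

When $\alpha = 0$, the GMI follows directly from (91); when $0 < \alpha < 1$, the terms related to $T(\omega)$ in
(91) are

$$f(T(\omega)) = \frac{\alpha}{\pi}\int_{-\pi}^{\pi} \mathrm{Re}\{F^*(\omega) T(\omega) \tilde{M}(\omega)\}\, d\omega + \frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{M(\omega)\, |T(\omega)F(\omega)|^2}{1 + |F(\omega)|^2}\, d\omega. \tag{92}$$

As the elements of the main diagonal and the first $\nu$ lower diagonals of matrix $T$ are constrained
to zero, we define the vector $t$ that specifies the Toeplitz matrix $T$ as

$$t = [\,t_{-N_T} \; \cdots \; t_{-1} \;\; t_{\nu+1} \; \cdots \; t_{N_T}\,],$$

and with $\phi(\omega)$ defined in (47), the Fourier series $T(\omega)$ with a finite tap length $N_T$ is

$$T(\omega) = \sum_{k} t_k \exp(jk\omega) = t\, \phi(\omega). \tag{93}$$

Furthermore, with $\varepsilon_1$ and $\varepsilon_2$ defined in (48), (92) can be rewritten as

$$f(T(\omega)) = t\, \varepsilon_2\, t^H + 2\,\mathrm{Re}\{t\, \varepsilon_1\}.$$

Therefore the optimal $t$ is

$$t_{\mathrm{opt}} = -\varepsilon_1^H \varepsilon_2^{-1}. \tag{94}$$

Putting $t_{\mathrm{opt}}$ back into (91)-(93), the optimal $T(\omega)$ is given in (50) and $I(W(\omega), T(\omega), F(\omega))$ for
the optimal $W(\omega)$ and $T(\omega)$ is given in (51).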


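Appendix I: The Proof of Proposition 7

The Fourier series associated to the Toeplitz matrix $V$ is

$$V(\omega) = \sum_{k=-\infty}^{\infty} v_k \exp(jk\omega),$$

and the differential of $I(V(\omega), R(\omega), G(\omega))$ in (56) with respect to $v_k$ (where $\omega$ is fixed) is

$$\begin{aligned}
\frac{\partial I}{\partial v_k} &= -\frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{\left(N_0 + |H(\omega)|^2\right) V^*(\omega)}{1 + G(\omega)} \exp(jk\omega)\, d\omega \\
&\quad + \frac{1}{2\pi}\int_{-\pi}^{\pi} \left(H(\omega) + \frac{\alpha H(\omega) R^*(\omega)}{1 + G(\omega)}\right) \exp(jk\omega)\, d\omega.
\end{aligned} \tag{95}$$

Since (95) shall equal zero for all $k$, the optimal $V(\omega)$ is given in (59). Putting $V_{\mathrm{opt}}(\omega)$ in (59)
back into (56) yields

$$\begin{aligned}
I(V_{\mathrm{opt}}(\omega), R(\omega), G(\omega)) &= 1 + \frac{\alpha}{\pi}\int_{-\pi}^{\pi} \mathrm{Re}\{\tilde{M}(\omega) R(\omega)\}\, d\omega \\
&\quad + \frac{1}{2\pi}\int_{-\pi}^{\pi} \left(\log\left(1 + G(\omega)\right) + \frac{M(\omega)\, |R(\omega)|^2}{1 + G(\omega)} + \tilde{M}(\omega)\left(1 + G(\omega)\right)\right) d\omega,
\end{aligned} \tag{96}$$

where $M(\omega)$ and $\tilde{M}(\omega)$ are defined in (45) and (46).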




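When $\alpha = 0$, the GMI in (96) equals (62), and when $0 < \alpha < 1$, the terms of $I(V_{\mathrm{opt}}(\omega), R(\omega), G(\omega))$
related to $R(\omega)$ in (96) are

$$g(R(\omega)) = \frac{\alpha}{\pi}\int_{-\pi}^{\pi} \mathrm{Re}\{\tilde{M}(\omega) R(\omega)\}\, d\omega + \frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{M(\omega)\, |R(\omega)|^2}{1 + G(\omega)}\, d\omega. \tag{97}$$

Define the vector $r$ that specifies the Toeplitz matrix $R$ as

$$r = [\,r_{-N_R} \; \cdots \; r_{-\nu_R-1} \;\; r_{\nu_R+1} \; \cdots \; r_{N_R}\,],$$

and with $\psi(\omega)$ defined in (57), the Fourier series $R(\omega)$ with a finite tap length $N_R$ is

$$R(\omega) = \sum_{k} r_k \exp(jk\omega) = r\, \psi(\omega), \tag{98}$$

where $2\nu_R + 1$ is the size of the band within which $R$ is constrained to zero. With $\zeta_1$ and $\zeta_2$ defined in (58),
(97) can be written as

$$g(R(\omega)) = r\, \zeta_2\, r^H + 2\,\mathrm{Re}\{r\, \zeta_1\}.$$

Therefore the optimal $r$ is

$$r_{\mathrm{opt}} = -\zeta_1^H \zeta_2^{-1}. \tag{99}$$

This shows that $r_{\mathrm{opt}}$ has Hermitian symmetry, as $G(\omega)$, $M(\omega)$ and $\tilde{M}(\omega)$ are all real valued,
and thus $R_{\mathrm{opt}}(\omega)$ is real. Putting $r_{\mathrm{opt}}$ back into (96)-(98), the optimal $R(\omega)$ is given in (60) and
$I(V(\omega), R(\omega), G(\omega))$ for the optimal $V(\omega)$ and $R(\omega)$ is given in (61).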


Appendix J: The Concavity Proof of Method II with ISI Channels

In order to prove that $I(V_{\mathrm{opt}}(\omega), R_{\mathrm{opt}}(\omega), G(\omega))$ in (61) is concave with respect to $G(\omega)$, it is
sufficient to prove that $\zeta_1^H \zeta_2^{-1} \zeta_1$ is convex with respect to $G(\omega)$. For a positive definite matrix
$\zeta_2$, $\zeta_1^H \zeta_2^{-1} \zeta_1$ is convex and non-increasing (with respect to a generalized inequality for positive
definite Hermitian matrices) for any vector $\zeta_1$ and with arbitrary finite tap length $N_R$.
As matrix $M$ is negative definite, $\zeta_2$ in (58) is concave with respect to $G(\omega)$ under the constraint
that $I+G$ is positive definite. Hence $\zeta_1^H \zeta_2^{-1} \zeta_1$ is convex in $G(\omega)$ by the composition theorem
[50, Chapter 3.6].


Appendix K: The Proof of Lemma 5

With the GMI in Method III given in (38), the theorem characterizing the optimal $G$ shows that it satisfies

$$[(I+G_{\mathrm{opt}})^{-1}]_\nu = -[M]_\nu,$$

with $M$ defined in (39). Notice that, when $P = 0$, Method III and Method II are equivalent since the matrix $M$
of Method III then coincides with the corresponding matrix of Method II. Hence, in
order to prove Lemma 5, it is sufficient to prove that $[M]_\nu$ of Method III converges to its Method II
counterpart as $N_0 \to 0$ and $N_0 \to \infty$.

When $P \prec I$, $C_k$ is positive definite as in (33), and as $N_0 \to 0$,




-U-l ttH 






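Therefore, with $W$ and $C$ defined in (32)-(36),

$$\begin{aligned}
WH &= I - N_0 (H^H H)^{-1} + O(N_0^2), \\
C &= [WH]_{\backslash\nu} = -N_0 \left[(H^H H)^{-1}\right]_{\backslash\nu} + O(N_0^2).
\end{aligned} \tag{100}$$

With (100) and $M$ defined in (39), it can be verified that

$$\lim_{N_0 \to 0} \left[M/N_0\right]_\nu = -\left[(H^H H)^{-1}\right]_\nu.$$

On the other hand, when $N_0 \to \infty$, from (32)-(39),

$$\begin{aligned}
N_0 W &= H^H \left(H C_k H^H / N_0 + I\right)^{-1} = H^H + O(1/N_0), \\
N_0 C &= N_0 [WH]_{\backslash\nu} = \left[H^H H\right]_{\backslash\nu} + O(1/N_0).
\end{aligned} \tag{101}$$

With (101) and $M$ defined in (39), it can be verified that

$$\lim_{N_0 \to \infty} \left[N_0 (I + M)\right]_\nu = \left[H^H H\right]_\nu.$$

Therefore, from (72), $[M]_\nu$ of Method III converges to its Method II counterpart as $N_0 \to 0$ and
$N_0 \to \infty$, which completes the proof.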
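The orders of the two expansions can be illustrated numerically. The sketch below is not the paper's setup: it assumes the plain LMMSE filter $W = H^H(HH^H + N_0 I)^{-1}$, i.e. $C_k$ replaced by the identity, and a random test channel with more receive than transmit dimensions so that $H^H H$ is invertible. The function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
N, K = 8, 3
H = (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))) / np.sqrt(2)
HhH = H.conj().T @ H

def lmmse_filter(N0):
    """W = H^H (H H^H + N0 I)^{-1}; assumes C_k = I (an illustrative simplification)."""
    return H.conj().T @ np.linalg.inv(H @ H.conj().T + N0 * np.eye(N))

def low_noise_error(N0):
    """Norm of W H - (I - N0 (H^H H)^{-1}), the remainder of the low-noise expansion."""
    approx = np.eye(K) - N0 * np.linalg.inv(HhH)
    return np.linalg.norm(lmmse_filter(N0) @ H - approx)

def high_noise_error(N0):
    """Norm of N0 W - H^H, the remainder of the high-noise expansion."""
    return np.linalg.norm(N0 * lmmse_filter(N0) - H.conj().T)

# Reducing N0 tenfold should shrink an O(N0^2) remainder by a factor close to 100.
ratio_low = low_noise_error(1e-2) / low_noise_error(1e-3)
# Increasing N0 tenfold should shrink an O(1/N0) remainder by a factor close to 10.
ratio_high = high_noise_error(1e3) / high_noise_error(1e4)
```

The two ratios indicate the decay orders $O(N_0^2)$ and $O(1/N_0)$ that the proof relies on; with a general positive definite $C_k$ the constants change, and this simplified check does not cover that case.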